Hi,
we are using grib_api extensively here at FMI. According to the latest
release notes of grib_api
(http://www.ecmwf.int/products/data/software/history/grib_api.html),
version 1.9.16 is thread safe. This is good news for us since we are in
the process of multithreading our application that converts gribs to our
own internal binary data format.
Now, the problem we are facing is that grib_api does not seem to be
thread safe after all (or at least that's how it seems from our
application's perspective)! I'm not actually sure if the problem is with
grib_api or our own application, so any help is appreciated.
Our application reads many grib files simultaneously in multiple threads
and stores the values in memory. The grib files are different; no single
grib file is ever opened by two threads at the same time. Quite
regularly, but not always, we get "GRIB_API ERROR :
grib_handle_new_from_message: cannot create handle, no definitions
found" or "fatal flex scanner internal error--end of buffer missed" or
some other error message.
One particular stack trace:
#0 0x00000036e1479b80 in strlen () from /lib64/libc.so.6
#1 0x000000000054f20c in grib_context_strdup (c=0x8ca340, s=0x0) at
grib_context.c:599
#2 0x00000000005541bc in grib_parser_include (
fname=0x2aaabc15fd10
"/cluster/home/weto/lib/grib_api_dir-1.9.16/share/definitions/common/statistics_grid.def")
at grib_parse_utils.c:463
#3 0x0000000000554346 in parse (gc=0x8ca340,
filename=0x2aaabc15fd10
"/cluster/home/weto/lib/grib_api_dir-1.9.16/share/definitions/common/statistics_grid.def")
at grib_parse_utils.c:490
#4 0x000000000055462f in grib_parse_stream (gc=0x8ca340,
filename=0x2aaabc15fd10
"/cluster/home/weto/lib/grib_api_dir-1.9.16/share/definitions/common/statistics_grid.def")
at grib_parse_utils.c:507
#5 grib_parse_file (gc=0x8ca340,
filename=0x2aaabc15fd10
"/cluster/home/weto/lib/grib_api_dir-1.9.16/share/definitions/common/statistics_grid.def")
at grib_parse_utils.c:582
#6 0x000000000056041f in create_accessor (p=0x2aaac409bf40,
act=0x15e9e10, h=0x0) at action_class_template.c:176
#7 0x0000000000560455 in create_accessor (p=0x2aaac409bce0, act=<value
optimized out>, h=0x0) at action_class_template.c:188
#8 0x000000000055e420 in create_accessor (p=<value optimized out>,
act=0x15c3a60, h=0x0) at action_class_if.c:184
#9 0x0000000000560455 in create_accessor (p=0x2aaac4097d20, act=<value
optimized out>, h=0x0) at action_class_template.c:188
#10 0x000000000055e420 in create_accessor (p=<value optimized out>,
act=0x2aaabc0584d0, h=0x0) at action_class_if.c:184
#11 0x0000000000560455 in create_accessor (p=0x2aaac4048770, act=<value
optimized out>, h=0x0) at action_class_template.c:188
#12 0x000000000055e420 in create_accessor (p=<value optimized out>,
act=0x2aaabc04be80, h=0x0) at action_class_if.c:184
#13 0x0000000000550b07 in grib_handle_create (gl=0x2aaac4048b60,
c=<value optimized out>, data=<value optimized out>,
buflen=<value optimized out>) at grib_handle.c:210
#14 0x0000000000550e5d in grib_handle_new_from_file_no_multi
(c=0x8ca340, f=0x2aaac4048920, error=0x41e00cd4) at grib_handle.c:800
#15 grib_handle_new_from_file (c=0x8ca340, f=0x2aaac4048920,
error=0x41e00cd4) at grib_handle.c:379
#16 0x0000000000462728 in
NFmiGribGrid::DecodeGrib(std::basic_string<char, std::char_traits<char>,
std::allocator<char> > const&, float*, long*, grib_header_t*) ()
#17 0x0000000000462f41 in NFmiGribGrid::Decode(float**) ()
#18 0x0000000000463969 in NFmiGribGrid::Init(DBGribData const&) ()
#19 0x000000000044c08f in NFmiModelDB::ExtractGrid() ()
#20 0x000000000042ec4a in NFmiDB::MoveDataToPool() ()
#21 0x000000000044ec76 in NFmiModelDB::DBRead() ()
#22 0x000000000044b1f7 in NFmiModelDB::run(std::basic_string<char,
std::char_traits<char>, std::allocator<char> > const&) ()
#23 0x00000000005df97f in thread_proxy ()
#24 0x00000036e1c0673d in start_thread () from /lib64/libpthread.so.0
#25 0x00000036e14d40cd in clone () from /lib64/libc.so.6
I have attached a simplified test case that demonstrates the problem by
reading five grib files in five threads. The test case was tested on
CentOS 5.8 x86_64 with g++ 4.4.6. Test case output:
$ ./grib_api_threaded
Using grib_api version 10916
Thread 0 reading file 0_n.grib
Thread 1 reading file 1_n.grib
GRIB_API ERROR : grib_handle_new_from_message: cannot create handle,
no definitions found
Segmentation fault
If the test case is run with just one thread (set MAX_THREADS=1 and
recompile), the reading is successful. Output:
$ ./grib_api_threaded
Using grib_api version 10916
Thread 0 reading file 0_n.grib
Thread 0 reading file 1_n.grib
Thread 0 reading file 2_n.grib
Thread 0 reading file 3_n.grib
Thread 0 reading file 4_n.grib
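The attachment itself is not reproduced in this message. As a rough sketch only (not the attached code), the threading pattern it exercises could look like the following. The helper names files_for_thread and run_readers, the DecodeFn callback, and the round-robin file assignment are assumptions made for illustration, and the sketch uses C++11 std::thread, which may not match the threading library used in the real test. The decode step is passed in as a callback so that the harness compiles without grib_api; in a real reproduction the callback would open the file and create a handle with grib_handle_new_from_file().

```cpp
#include <cassert>
#include <cstdio>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical callback type: decodes one file, returns true on success.
// In a real reproduction the body would fopen() the file and call
// grib_handle_new_from_file() followed by grib_handle_delete().
typedef std::function<bool(const std::string&)> DecodeFn;

// Assumed round-robin assignment: thread t gets files t, t + max_threads, ...
// so that no file is ever handled by two threads.
std::vector<std::string> files_for_thread(int thread_id, int max_threads,
                                          int file_count) {
    std::vector<std::string> files;
    for (int i = thread_id; i < file_count; i += max_threads)
        files.push_back(std::to_string(i) + "_n.grib");
    return files;
}

// Starts max_threads workers, waits for them, and returns how many files
// were decoded successfully.
int run_readers(int max_threads, int file_count, const DecodeFn& decode) {
    assert(max_threads > 0);
    std::mutex m;  // protects `ok` and keeps log lines from interleaving
    int ok = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < max_threads; ++t) {
        workers.emplace_back([&, t]() {
            for (const std::string& f :
                 files_for_thread(t, max_threads, file_count)) {
                {
                    std::lock_guard<std::mutex> lock(m);
                    std::printf("Thread %d reading file %s\n", t, f.c_str());
                }
                // The decode call itself runs outside the lock, i.e.
                // concurrently with the other threads.
                bool success = decode(f);
                std::lock_guard<std::mutex> lock(m);
                if (success) ++ok;
            }
        });
    }
    for (std::thread& w : workers) w.join();
    return ok;
}
```

With this sketch, run_readers(5, 5, decode) corresponds to the failing five-thread run and run_readers(1, 5, decode) to the MAX_THREADS=1 run, where thread 0 reads all five files in order.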
Regards
Mikko
[Created via e-mail received from: Mikko Partio <mikko.partio@fmi.fi>]