  Software Support / SUP-155

Problem with grib_api and threads


    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • GRIB
    • Centos 5.8 x86_64, g++ 4.4.6, grib_api 1.9.16

    • Member State Met Service

      Hi,

      we are using grib_api extensively here at FMI. According to the latest
      release notes of grib_api
      (http://www.ecmwf.int/products/data/software/history/grib_api.html),
      version 1.9.16 is thread safe. This is good news for us since we are in
      the process of multithreading our application that converts gribs to our
      own internal binary data format.

      Now, the problem we are facing is that grib_api does not seem to be
      thread safe after all (or at least that is how it looks from our
      application's perspective)! I'm not actually sure if the problem is with
      grib_api or our own application, so any help is appreciated.

      Our application reads many grib files simultaneously in multiple threads
      and stores the values in memory. The grib files are different; no single
      grib file is ever opened by two threads at the same time. Quite
      regularly, but not always, we get "GRIB_API ERROR :
      grib_handle_new_from_message: cannot create handle, no definitions
      found" or "fatal flex scanner internal error--end of buffer missed" or
      some other error message.
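
      One way to narrow down whether the race is inside handle creation is to
      serialize just that call and see if the errors disappear. The following
      is only a sketch under assumptions: locked_handle_new_from_file and the
      USE_GRIB_API switch are hypothetical names introduced here, and the stub
      exists only so the fragment compiles without grib_api installed. It is a
      diagnostic, not a fix, since it removes the parallelism it guards.

```c
#include <pthread.h>
#include <stdio.h>

#ifdef USE_GRIB_API            /* hypothetical switch: build with -DUSE_GRIB_API */
#include <grib_api.h>
#else
/* Stand-ins so the sketch compiles without grib_api; they decode nothing. */
typedef struct grib_handle grib_handle;
static grib_handle *grib_handle_new_from_file(void *c, FILE *f, int *err)
{
    (void)c;
    (void)f;
    *err = 0;
    return NULL;
}
#endif

/* One process-wide lock around handle creation. */
static pthread_mutex_t handle_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical wrapper: at most one thread creates a handle at a time.
 * A NULL context asks grib_api for its default context. */
grib_handle *locked_handle_new_from_file(FILE *f, int *err)
{
    grib_handle *h;

    pthread_mutex_lock(&handle_lock);
    h = grib_handle_new_from_file(NULL, f, err);
    pthread_mutex_unlock(&handle_lock);
    return h;
}
```

      If the errors vanish once every thread creates its handles through the
      wrapper, that would point at shared state in the definition parsing
      rather than at our own code.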

      One particular stack trace:

      #0 0x00000036e1479b80 in strlen () from /lib64/libc.so.6
      #1 0x000000000054f20c in grib_context_strdup (c=0x8ca340, s=0x0) at
      grib_context.c:599
      #2 0x00000000005541bc in grib_parser_include (
      fname=0x2aaabc15fd10
      "/cluster/home/weto/lib/grib_api_dir-1.9.16/share/definitions/common/statistics_grid.def")
      at grib_parse_utils.c:463
      #3 0x0000000000554346 in parse (gc=0x8ca340,
      filename=0x2aaabc15fd10
      "/cluster/home/weto/lib/grib_api_dir-1.9.16/share/definitions/common/statistics_grid.def")
      at grib_parse_utils.c:490
      #4 0x000000000055462f in grib_parse_stream (gc=0x8ca340,
      filename=0x2aaabc15fd10
      "/cluster/home/weto/lib/grib_api_dir-1.9.16/share/definitions/common/statistics_grid.def")
      at grib_parse_utils.c:507
      #5 grib_parse_file (gc=0x8ca340,
      filename=0x2aaabc15fd10
      "/cluster/home/weto/lib/grib_api_dir-1.9.16/share/definitions/common/statistics_grid.def")
      at grib_parse_utils.c:582
      #6 0x000000000056041f in create_accessor (p=0x2aaac409bf40,
      act=0x15e9e10, h=0x0) at action_class_template.c:176
      #7 0x0000000000560455 in create_accessor (p=0x2aaac409bce0, act=<value
      optimized out>, h=0x0) at action_class_template.c:188
      #8 0x000000000055e420 in create_accessor (p=<value optimized out>,
      act=0x15c3a60, h=0x0) at action_class_if.c:184
      #9 0x0000000000560455 in create_accessor (p=0x2aaac4097d20, act=<value
      optimized out>, h=0x0) at action_class_template.c:188
      #10 0x000000000055e420 in create_accessor (p=<value optimized out>,
      act=0x2aaabc0584d0, h=0x0) at action_class_if.c:184
      #11 0x0000000000560455 in create_accessor (p=0x2aaac4048770, act=<value
      optimized out>, h=0x0) at action_class_template.c:188
      #12 0x000000000055e420 in create_accessor (p=<value optimized out>,
      act=0x2aaabc04be80, h=0x0) at action_class_if.c:184
      #13 0x0000000000550b07 in grib_handle_create (gl=0x2aaac4048b60,
      c=<value optimized out>, data=<value optimized out>,
      buflen=<value optimized out>) at grib_handle.c:210
      #14 0x0000000000550e5d in grib_handle_new_from_file_no_multi
      (c=0x8ca340, f=0x2aaac4048920, error=0x41e00cd4) at grib_handle.c:800
      #15 grib_handle_new_from_file (c=0x8ca340, f=0x2aaac4048920,
      error=0x41e00cd4) at grib_handle.c:379
      #16 0x0000000000462728 in
      NFmiGribGrid::DecodeGrib(std::basic_string<char, std::char_traits<char>,
      std::allocator<char> > const&, float*, long*, grib_header_t*) ()
      #17 0x0000000000462f41 in NFmiGribGrid::Decode(float**) ()
      #18 0x0000000000463969 in NFmiGribGrid::Init(DBGribData const&) ()
      #19 0x000000000044c08f in NFmiModelDB::ExtractGrid() ()
      #20 0x000000000042ec4a in NFmiDB::MoveDataToPool() ()
      #21 0x000000000044ec76 in NFmiModelDB::DBRead() ()
      #22 0x000000000044b1f7 in NFmiModelDB::run(std::basic_string<char,
      std::char_traits<char>, std::allocator<char> > const&) ()
      #23 0x00000000005df97f in thread_proxy ()
      #24 0x00000036e1c0673d in start_thread () from /lib64/libpthread.so.0
      #25 0x00000036e14d40cd in clone () from /lib64/libc.so.6

      I have attached a simplified test case that demonstrates the problem by
      reading five grib files in five threads. It was tested on
      CentOS 5.8 x86_64 with g++ 4.4.6. Test case output:

      $ ./grib_api_threaded
      Using grib_api version 10916
      Thread 0 reading file 0_n.grib
      Thread 1 reading file 1_n.grib
      GRIB_API ERROR : grib_handle_new_from_message: cannot create handle,
      no definitions found
      Segmentation fault

      If the test case is run with just one thread (set MAX_THREADS=1 and
      recompile), the reading is successful. Output:

      $ ./grib_api_threaded
      Using grib_api version 10916
      Thread 0 reading file 0_n.grib
      Thread 0 reading file 1_n.grib
      Thread 0 reading file 2_n.grib
      Thread 0 reading file 3_n.grib
      Thread 0 reading file 4_n.grib
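
      The attachment itself is not reproduced in this text. As a rough
      illustration only, a test of this shape might look like the sketch
      below. The work-queue layout, run_threads, read_file and the
      USE_GRIB_API switch are assumptions made for this sketch; only
      MAX_THREADS, the N_n.grib file names and the output format come from
      the report above. Without -DUSE_GRIB_API the files are merely opened
      and closed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

#ifdef USE_GRIB_API            /* hypothetical switch: link against grib_api */
#include <grib_api.h>
#endif

#define MAX_THREADS 5          /* name from the report; set to 1 for one thread */

static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static int next_file;          /* index of the next file to hand out */
static int file_count;         /* total number of files to read */
static int failures;           /* files that could not be read */

/* Open one file and create a handle for every message; 0 on success. */
static int read_file(const char *path)
{
    int err = 0;
    FILE *f = fopen(path, "rb");

    if (f == NULL)
        return 1;
#ifdef USE_GRIB_API
    {
        grib_handle *h;
        /* NULL context selects the default context; err is expected
         * to be 0 after a clean end of file. */
        while ((h = grib_handle_new_from_file(NULL, f, &err)) != NULL)
            grib_handle_delete(h);
    }
#endif
    fclose(f);
    return err != 0;
}

/* Each thread takes the next unread file index until none are left. */
static void *worker(void *arg)
{
    int id = *(int *)arg;
    char path[64];

    for (;;) {
        int i;

        pthread_mutex_lock(&queue_lock);
        i = (next_file < file_count) ? next_file++ : -1;
        pthread_mutex_unlock(&queue_lock);
        if (i < 0)
            break;

        snprintf(path, sizeof path, "%d_n.grib", i);
        printf("Thread %d reading file %s\n", id, path);
        if (read_file(path)) {
            pthread_mutex_lock(&queue_lock);
            failures++;
            pthread_mutex_unlock(&queue_lock);
        }
    }
    return NULL;
}

/* Read files 0..n_files-1 with n_threads threads; returns the failure count. */
int run_threads(int n_threads, int n_files)
{
    pthread_t tid[MAX_THREADS];
    int ids[MAX_THREADS];
    int i;

    assert(n_threads >= 1 && n_threads <= MAX_THREADS);
    next_file = 0;
    file_count = n_files;
    failures = 0;
    for (i = 0; i < n_threads; i++) {
        ids[i] = i;
        pthread_create(&tid[i], NULL, worker, &ids[i]);
    }
    for (i = 0; i < n_threads; i++)
        pthread_join(tid[i], NULL);
    return failures;
}
```

      Calling run_threads(MAX_THREADS, 5) from main would correspond to the
      failing five-thread run, and run_threads(1, 5) to the single-thread
      run that succeeds.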

      Regards

      Mikko

      [Created via e-mail received from: Mikko Partio <mikko.partio@fmi.fi>]

            usv Daniel Varela Santoalla
            admin admin