Summary:ASTERISK-12762: dundi lookups occasionally stop working
Date Opened:2008-09-22 05:11:46
Date Closed:2008-11-13 22:19:01.000-0600
Description:I have three Asterisk installations in different locations. When an extension is dialed and it is not in the local registrations, DUNDi is used to query the other boxes for that extension. Everything works fine, but after some random(?) run time, DUNDi lookups break: one of the sites (a different one each time) loses the ability to find an extension registered on another box.

So 'dundi lookup 8201@global-pub-services' just finds nothing. I'm sure that it is not a firewall issue because I can turn on 'dundi debug' on the remote end and actually see the queries coming in.

Restarting Asterisk (the one which should respond to this particular DUNDi query) solves the problem, and 'dundi lookup 8201@global-pub-services' works again.

The problem is that I do not know how to reproduce it; it just happens from time to time.

Please advise what kind of debug information or evidence I need to collect in order to debug the problem.
Comments:
By: Dmitry Andrianov (dimas) 2008-09-22 07:24:41

Right now, I have one of the boxes failing to do DUNDi lookups.
I'm not sure if the symptoms are the same every time, but at this specific moment 'dundi show peers' on that "bad" host for some reason displays nothing:

voip1*CLI> dundi show peers
EID                  Host                Model      AvgTime  Status        
0 dundi peers [0 online, 0 offline, 0 unmonitored]

I believe this happens because the box was restarted while the DNS server was down (so it could not resolve the hostnames from dundi.conf). However, I'm absolutely sure I had situations where "dundi connectivity" was lost even without a box restart.
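
To catch the empty peer table when it occurs, rather than after calls have already failed, the summary line of 'dundi show peers' could be checked periodically. The following is only a minimal sketch: it assumes the summary line always has the shape shown in the output above, and the function names (parse_peer_summary, peers_missing) are made up for illustration, not part of Asterisk.

```python
import re

# Matches the summary line as it appears in the CLI output quoted above, e.g.
# "0 dundi peers [0 online, 0 offline, 0 unmonitored]".
# Assumption: the format is the same for other counts.
SUMMARY_RE = re.compile(
    r"^(\d+) dundi peers \[(\d+) online, (\d+) offline, (\d+) unmonitored\]"
)


def parse_peer_summary(cli_output):
    """Return the peer counts from 'dundi show peers' output, or None if absent."""
    for line in cli_output.splitlines():
        match = SUMMARY_RE.match(line.strip())
        if match:
            total, online, offline, unmonitored = map(int, match.groups())
            return {
                "total": total,
                "online": online,
                "offline": offline,
                "unmonitored": unmonitored,
            }
    return None


def peers_missing(cli_output):
    """True when a summary line was found and it reports zero configured peers."""
    summary = parse_peer_summary(cli_output)
    return summary is not None and summary["total"] == 0
```

The text passed in would be whatever the CLI printed, captured for example by running the command non-interactively from cron. Recording the time at which peers_missing first returns True would help correlate the failure with restarts, reloads or DNS outages.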

By: Tilghman Lesher (tilghman) 2008-11-13 17:04:21.000-0600

Do you have entries of "Ignoring invalid EID entry" in your logs (NOTICE level) shortly before the DUNDi hosts go missing?  The only way that I can see that the hosts would disappear is with a reload event.
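
One way to answer this question after the fact is to search the saved log for the quoted message. This is a rough sketch only: the helper name is hypothetical, and it assumes the log contains the message text exactly as quoted above, with no assumption made about the rest of the line format.

```python
# Message text quoted in the comment above; the surrounding log line
# format (timestamp, level, source file) is not assumed here.
MARKER = "Ignoring invalid EID entry"


def find_invalid_eid_notices(log_lines):
    """Return every log line that contains the invalid-EID notice."""
    return [line.rstrip("\n") for line in log_lines if MARKER in line]
```

It accepts any iterable of lines, so an open log file object can be passed directly. Comparing the timestamps of the returned lines with the time the peers went missing would show whether a reload was involved.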

By: Dmitry Andrianov (dimas) 2008-11-13 17:20:18.000-0600

The problem has not happened since I filed the issue, so I cannot tell.
Which is good on the one hand (everything works) but bad on the other (the issue remains unfixed).

By: Tilghman Lesher (tilghman) 2008-11-13 22:19:01.000-0600

If the problem is nonreproducible, then there's nothing more to be done here.  Of course, as soon as I close this issue, it's bound to happen again, so please feel free to reopen this issue once that happens.  Contrarily, once I predict that it will happen again, it probably won't.  But once I predict that it won't happen again, it will.  I can go back and forth on this, ad infinitum.