Summary: | ASTERISK-29880: PJSIP queues reach limits | ||
Reporter: | Spilios Georgakopoulos (sgeorgak) | Labels: | |
Date Opened: | 2022-01-25 03:03:08.000-0600 | Date Closed: | 2022-02-03 09:37:34.000-0600 |
Priority: | Major | Regression? | |
Status: | Closed/Complete | Components: | pjproject/pjsip |
Versions: | 18.6.0 | Frequency of Occurrence | Constant |
Related Issues: | |||
Environment: | RedHat Linux | Attachments: | |
Description: | Hello,
we have built a voicemail platform for telecom services for a telecom client, based on Asterisk (18.6) and OpenSIPS (3.1.0). Below are some details of the architecture:
• SIP signaling from the core network
• 2 OpenSIPS proxy VMs for the incoming calls (load balancer to the Asterisk VMs) and outgoing calls
• 3 Asterisk VMs handling the calls, with dialplans built on phpagi (Asterisk conf files and PHP-AGI files). We call phpagi scripts to run procedures in the DB or for some other processes.
• The data for user settings is stored in an external DB (Oracle, on a different VM)
• The data for file storage (voicemails etc.) is stored on NFS

We have faced, and are still facing, serious problems in the service (in the queues of the PJSIP processor) when we transfer more traffic from the core system (increasing the active calls per Asterisk), even though this traffic is not large enough to justify the problems in the voicemail service. The Asterisk resources are 12 CPU cores and 32 GB RAM, and usage stays below 50% when the problem occurs. This causes delays in call processing, and the problem grows until the service cannot support any call. During the problem, the active calls in Asterisk keep increasing while the queues reach the limit (we have no issue up to ~100 active calls per Asterisk). Have you ever faced the same problem? We need your support since this is a production issue and affects the service.

Br, Spilios G. | ||
Comments: | By: Asterisk Team (asteriskteam) 2022-01-25 03:03:09.822-0600
Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution. Please note that log messages and other files should not be sent to the Sangoma Asterisk Team unless explicitly asked for. All files should be placed on this issue in a sanitized fashion as needed. A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report. Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process]. Please note that once your issue enters an open state it has been accepted. As Asterisk is an open source project there is no guarantee or timeframe on when your issue will be looked into. If you need expedient resolution you will need to find and pay a suitable developer. Asking for an update on your issue will not yield any progress on it and will not result in a response. All updates are posted to the issue when they occur. Please note that by submitting data, code, or documentation to Sangoma through JIRA, you accept the Terms of Use present at [https://www.asterisk.org/terms-of-use/|https://www.asterisk.org/terms-of-use/].

By: Asterisk Team (asteriskteam) 2022-01-25 03:03:10.124-0600
We appreciate the difficulties you are facing, however information request type issues would be better served in a different forum. The Asterisk community provides support over IRC, mailing lists, and forums as described at http://asterisk.org/community. The Asterisk issue tracker is used specifically to track issues concerning bugs and documentation errors. If this issue is actually a bug please use the Bug issue type instead. Please see the Asterisk Issue Guidelines [1] for instruction on the intended use of the Asterisk issue tracker. Thanks!
[1] https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines

By: Asterisk Team (asteriskteam) 2022-01-25 03:03:10.607-0600
The severity of this issue has been automatically downgraded from "Blocker" to "Major". The "Blocker" severity is reserved for issues which have been determined to block the next release of Asterisk. This severity can only be set by privileged users. If this issue is deemed to block the next release it will be updated accordingly during the triage process.

By: Asterisk Team (asteriskteam) 2022-01-25 03:05:23.629-0600
This issue has been reopened as a result of your commenting on it as the reporter. It will be triaged once again as applicable.

By: Joshua C. Colp (jcolp) 2022-01-25 08:11:07.444-0600
We require additional debug to continue with triage of your issue. Please follow the instructions on the wiki [1] for how to collect debugging information from Asterisk. For expediency, where possible, attach the debug with a '.txt' file extension so that the debug will be usable for further analysis. Thanks!
[1] https://wiki.asterisk.org/wiki/display/AST/Collecting+Debug+Information
Additionally we need to see actual configuration. I should also add that this is an open source project, there is no support agreement or timeframe on when or if this would get looked into.

By: Spilios Georgakopoulos (sgeorgak) 2022-01-26 04:50:02.207-0600
Hello Joshua, thank you for your answer. We will discuss with the client, enable the requested logs in production when the problem occurs, and upload them here as soon as possible.

By: Spilios Georgakopoulos (sgeorgak) 2022-02-02 09:32:51.887-0600
Hello Joshua, after analysis, we found that one action takes a long time and keeps the channel of each call open for a long time. The action is the following:
VERBOSE[4189812][C-000000b3] pbx.c: Executing [h@guest-access-whc:4] Set("PJSIP/opensips-xxx-000000b2", "DESTINATION=xxxxxxxx74") in new stack
VERBOSE[4189812][C-000000b3] pbx.c: Executing [h@guest-access-whc:5] Set("PJSIP/opensips-xxx-000000b2", "PENDING_SMS_ID=") in new stack
We also took the logs that you mentioned and a core dump for this action. It is a query against the DB, as you can see from the core file. Could you please check them?

By: Joshua C. Colp (jcolp) 2022-02-02 09:41:33.866-0600
There's nothing really else to check. The provided backtrace only shows a single thread, waiting on the database as you mention. The dialplan just shows execution. If the database is slow, blocked or has problems, then it can and will cause issues in Asterisk including PJSIP queues piling up or channels hanging for long periods of time if they're relying on the database. You would need to investigate those aspects and understand the characteristics of the database, and if it is being slow. If logging is enabled in res_odbc.conf then the "odbc show" CLI command can provide insight into execution times.

By: Richard Mudgett (rmudgett) 2022-02-02 10:16:37.500-0600
This is not a bug. Long-running h extensions or hangup handlers interfere with the channel driver's hangup protocol. You must collect your data and offload any further post-call processing to another process.

By: Spilios Georgakopoulos (sgeorgak) 2022-02-02 10:37:23.982-0600
We enabled the logging, and "odbc show" shows that we have a delay on this query. max_connections is 1 (the default) in res_odbc.conf. Will increasing the number of connections improve the situation? The delays for this query occur when Asterisk starts accepting more than 40-50 active calls. After that, the problem grows, reaching 200+ active calls with service disruption.

By: Joshua C. Colp (jcolp) 2022-02-02 10:42:35.691-0600
It may or may not. It's dependent on your environment. At this point this issue would be a better fit on the community forum at https://community.asterisk.org/ as it is not an underlying Asterisk issue or bug.

By: Asterisk Team (asteriskteam) 2022-02-03 03:25:41.241-0600
This issue has been reopened as a result of your commenting on it as the reporter. It will be triaged once again as applicable. |
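
Richard Mudgett's advice in the thread (collect the data during hangup, then hand any further post-call processing to another process) could be sketched roughly as below. This is only an illustration under stated assumptions, not code from the reporter's platform and not an Asterisk API: the spool directory and the names `enqueue_post_call_job` and `drain_spool` are hypothetical, and the slow database query is represented by whatever `handler` the worker is given.

```python
import json
import os
import tempfile
import uuid


def enqueue_post_call_job(spool_dir, payload):
    """Hypothetical helper for the hangup path: persist the call data and return.

    The job is written to a temporary file and then renamed, so a worker
    never sees a half-written file. No database access happens here.
    """
    os.makedirs(spool_dir, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=spool_dir, suffix=".tmp")
    with os.fdopen(fd, "w") as fh:
        json.dump(payload, fh)
    final_path = os.path.join(spool_dir, uuid.uuid4().hex + ".json")
    os.replace(tmp_path, final_path)  # atomic within one filesystem
    return final_path


def drain_spool(spool_dir, handler):
    """Hypothetical worker step, meant to run in a separate process.

    Calls handler(payload) for every completed job file, removes the file
    afterwards, and returns the number of jobs processed.
    """
    processed = 0
    for name in sorted(os.listdir(spool_dir)):
        if not name.endswith(".json"):
            continue  # skip temporary files still being written
        path = os.path.join(spool_dir, name)
        with open(path) as fh:
            payload = json.load(fh)
        handler(payload)  # the slow work (e.g. the DB query) happens here
        os.remove(path)
        processed += 1
    return processed
```

In such a setup, the script invoked from the h extension would only call `enqueue_post_call_job` with values like DESTINATION and PENDING_SMS_ID and exit, so the channel can finish hanging up, while a separate long-running process calls `drain_spool` periodically. Whether this fits depends on the environment, as noted in the thread.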