Summary:ASTERISK-30264: res_pjsip: Subscription handlers do not get cleanly unregistered, causing crash
Reporter:N A (InterLinked)Labels:
Date Opened:2022-10-16 14:33:16Date Closed:2022-10-26 09:07:43
Status:Closed/CompleteComponents:Resources/res_pjsip_exten_state Resources/res_pjsip_mwi
Versions:20.0.0 Frequency of
Environment:Debian 10Attachments:
Description:The pubsub modules in Asterisk (...exten_state and mwi) cannot be cleanly reloaded/refreshed (unloaded and loaded).

If "module refresh res_pjsip_exten_state" or "module refresh res_pjsip_mwi" is run, Asterisk will crash due to an assertion in pjproject. The assertion is on "Make sure no module with the specified name already registered" in evsub.c.

This means that the handler is not getting cleaned up properly when the module is unloaded, so Asterisk is violating the PJSIP API.
pjsip_evsub_register_pkg is called in ast_sip_register_subscription_handler, but pjsip_evsub_unregister_pkg isn't called in ast_sip_unregister_subscription_handler, for instance.

Backtraces from all crashes of this type are similar, here is one for the MWI module arbitrarily:

Thread 1 (Thread 0x7f6370819700 (LWP 28799)):
#0  0x00007f639b67e7bb in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
       set = {__val = {0, 140065128352496, 0, 140065786033383, 4629771064781406208, 140065128352496, 140065128352597, 140065128352496, 140065128352496, 140065128352621, 140065128352796, 140065128352496, 140065128352796, 0, 0, 0}}
       pid = <optimized out>
       tid = <optimized out>
#1  0x00007f639b669535 in __GI_abort () at abort.c:79
       save_stage = 1
       act = {__sigaction_handler = {sa_handler = 0x7f6380035e10, sa_sigaction = 0x7f6380035e10}, sa_mask = {__val = {0, 303, 140065128352496, 0, 0, 0, 21474836480, 140065066024520, 140065066024368, 140065787144976, 140065787129568, 0, 3264005630345583104, 140065787114426, 0, 140065787129568}}, sa_flags = -1672162622, sa_restorer = 0x7f639c54d370}
       sigs = {__val = {32, 0 <repeats 15 times>}}
#2  0x00007f639b66940f in __assert_fail_base (fmt=0x7f639b7caee0 "%s%s%s:%u: %s%sAssertion `%s' failed.n%n", assertion=0x7f639c54d370 "find_pkg(event_name) == ((void *)0)", file=0x7f639c54d2c2 "../src/pjsip-simple/evsub.c", line=427, function=<optimized out>) at assert.c:92
       str = 0x7f6380035e10 "260206003200c177"
       total = 4096
#3  0x00007f639b677102 in __GI___assert_fail (assertion=0x7f639c54d370 "find_pkg(event_name) == ((void *)0)", file=0x7f639c54d2c2 "../src/pjsip-simple/evsub.c", line=427, function=0x7f639c54da00 <__PRETTY_FUNCTION__.9784> "pjsip_evsub_register_pkg") at assert.c:101
#4  0x00007f639c453652 in pjsip_evsub_register_pkg (pkg_mod=0x7f6378a71940 <pubsub_module>, event_name=0x7f6370818160, expires=3600, accept_cnt=1, accept=0x7f6370817f60) at ../src/pjsip-simple/evsub.c:427
       pkg = 0x7f6378a2310b
       i = 32611
       __PRETTY_FUNCTION__ = "pjsip_evsub_register_pkg"
#5  0x00007f6378a5d3b7 in ast_sip_register_subscription_handler (handler=0x7f6378a264a0 <mwi_handler>) at res_pjsip_pubsub.c:2843
       event = {ptr = 0x7f6378a2310b "message-summary", slen = 15}
       accept = {{ptr = 0x7f6378a230e8 "application/simple-message-summary", slen = 34}, {ptr = 0x0, slen = 0} <repeats 31 times>}
       existing = 0x0
       i = 1
       __PRETTY_FUNCTION__ = "ast_sip_register_subscription_handler"
       __FUNCTION__ = "ast_sip_register_subscription_handler"
#6  0x00007f6378a21f20 in load_module () at res_pjsip_mwi.c:1557
       mwi_container = 0x5a90b1d
       __FUNCTION__ = "load_module"
       __PRETTY_FUNCTION__ = "load_module"
#7  0x000055af05b58c10 in start_resource (mod=0x55af06713180) at loader.c:1718
       tmp = "hb321005257U000000J202201pc177000000Ȃ201pc177000000260202201pc177000000000000000000000000000000>;"231c177000000?;"231c177000000000227201pc177000000200202201pc177000000332o266005257U000000260202201pc177000000240202201pc177000000240202201pc177000000020O317005257U000000316202201pc177000000260202201p020000000000020203201pc177000000263e265005257U000000000000000000000000000000(7242xc177000000207vy5#352:323%*273065207344"$", '000' <repeats 56 times>...
       res = 21935
       __PRETTY_FUNCTION__ = "start_resource"
       __FUNCTION__ = "start_resource"
#8  0x000055af05b5929f in load_resource (resource_name=0x7f638000b8af "res_pjsip_mwi", suppress_logging=0, module_priorities=0x0, required=0, preload=0) at loader.c:1805
       mod = 0x55af06713180
       __FUNCTION__ = "load_resource"
       __PRETTY_FUNCTION__ = "load_resource"
#9  0x000055af05b593b1 in ast_load_resource (resource_name=0x7f638000b8af "res_pjsip_mwi") at loader.c:1828
       res = 0
       __PRETTY_FUNCTION__ = "ast_load_resource"
#10 0x000055af05aff432 in handle_refresh (e=0x55af05d94f80 <cli_cli+2400>, cmd=-4, a=0x7f6370818440) at cli.c:832
#11 0x000055af05b0738c in ast_cli_command_full (uid=0, gid=1004, fd=21, s=0x7f6370818720 "module refresh res_pjsip_mwi") at cli.c:3101
       args = {0x55af05d94f80 <cli_cli+2400> "020200312006257U", 0x7f638000b8a0 "module", 0x7f638000b8a7 "refresh", 0x7f638000b8af "res_pjsip_mwi", 0x0, 0x7f638000b079 "", 0x7f638000b07c "", 0x0, 0x0, 0x7f639c6d3b00 <_dl_fixup+208> "I211300d213004%030", 0x5 <error: Cannot access memory at address 0x5>, 0x0, 0x0, 0x7f639b650a98 "a=", 0x7f63708188d0 "", 0x7f639c6da4ba <_dl_runtime_resolve_xsave+138> "I211", <incomplete sequence 356>, 0x7f6370818918 "~p", 0x7f6370818918 "~p", 0x11 <error: Cannot access memory at address 0x11>, 0x1 <error: Cannot access memory at address 0x1>, 0x15 <error: Cannot access memory at address 0x15>, 0x7f6370818914 "f", 0x0, 0x0, 0x0, 0x0, 0x0, 0xffff00001fa5 <error: Cannot access memory at address 0xffff00001fa5>, 0x0 <repeats 17 times>, 0xffffffffffffffff <error: Cannot access memory at address 0xffffffffffffffff>, 0x632e726567676f6c <error: Cannot access memory at address 0x632e726567676f6c>, 0x5400475542454400 <error: Cannot access memory at address 0x5400475542454400>, 0x7f639b73feff <clone+63> "H211Ǹ<", 0x55af05c36a43 <ast_threadstorage_get+53> "H211E370H203", <incomplete sequence 370>, 0x55af0609e000 "276272355376", 0x7f638000ed10 "", 0x55af05aa0de3 <__ast_bt_get_addresses+29> "211302H213", <incomplete sequence 370211220>, 0x55af05b5f44a <__ast_rwlock_rdlock+181> "H213E340H211307350372331377377H213E340213@x203370t177YH213E340213@xH213U340H230Hi300b001", 0x55af05d31f70 <__PRETTY_FUNCTION__.17423> "ast_verb_update", 0x92170818900 <error: Cannot access memory at address 0x92170818900>, 0x55af05b5f2f8 <__ast_rwlock_unlock+794> "211Eԃ", <incomplete sequence 324>, 0x55af063631a0 "343r252005257U", 0x0, 0x0, 0x7f638000ed10 "", 0x7f6370818860 "", 0x0, 0x0, 0x2000000000000000 <error: Cannot access memory at address 0x2000000000000000>}
       e = 0x55af05d94f80 <cli_cli+2400>
       x = 3
       duplicate = 0x7f638000b8a0 "module"
       tmp = "module refresh res_pjsip_mwi000ilent000233c177000000𤘛c177000000260216201pc177000000000000000000000000000000"
       retval = 0x2 <error: Cannot access memory at address 0x2>
       a = {fd = 21, argc = 3, argv = 0x7f63708184c8, line = 0x0, word = 0x0, pos = 0, n = 0}
       __PRETTY_FUNCTION__ = "ast_cli_command_full"
#12 0x000055af05b07534 in ast_cli_command_multiple_full (uid=0, gid=1004, fd=21, size=29, s=0x7f6370818b70 "module refresh res_pjsip_mwi") at cli.c:3128
       cmd = "module refresh res_pjsip_mwi000ilent000pc177000000060u333005257U000000?;"231c177000000020355000200c177000000Ј201pc177000000346365265005257U000000000211201pc177000000274n323005257U000000002", '000' <repeats 311 times>...
       x = 28
       y = 29
       count = 0
#13 0x000055af05a87f15 in netconsole (vconsole=0x55af05dccea0 <consoles>) at asterisk.c:1464
       cmds_read = 2
       bytes_read = 29
       con = 0x55af05dccea0 <consoles>
       hostname = "voip", '000' <repeats 59 times>
       inbuf = "module refresh res_pjsip_mwi000ilent000logger mute silent", '000' <repeats 19 times>, "Cj303005257U000000H016000000000000000000340F333005257U000000>;"231c177000000020355000200c177000000060214201pc177000000Cj303005257U000000H016000000000000000000340F333005257U000000~320314005257U000000020355000200c177000000@214201pc177000000N227303005257U000000260215201pc177000000220#331005257U000000?;"231c177000000"...
       outbuf = "373IRC SASL authentication successfuln servern373Joined IRC channel #REDACTED-testn373Joined IRC channel #REDACTED-asteriskn373Joined IRC channel #REDACTEDnmay be removed in a future release.n and will be "...
       end_buf = 0x7f6370818d70 "voip"
       start_read = 0x7f6370818b70 "module refresh res_pjsip_mwi"
       res = 1
       fds = {{fd = 21, events = 1, revents = 1}, {fd = 22, events = 1, revents = 0}}
       __FUNCTION__ = "netconsole"
#14 0x000055af05c3a7a6 in dummy_start (data=0x7f637400b550) at utils.c:1574
       __cancel_buf = {__cancel_jmp_buf = {{__cancel_jmp_buf = {0, -4821410162945605947, 140065747647294, 140065747647295, 140065066030848, 0, -4821410162836554043, -1689408944167371067}, __mask_was_saved = 0}}, __pad = {0x7f6370818ed0, 0x0, 0x0, 0x0}}
       __cancel_routine = 0x55af05a85325 <ast_unregister_thread>
       __cancel_arg = 0x7f6370819700
       __not_first_call = 0
       ret = 0x0
       a = {start_routine = 0x55af05a87c78 <netconsole>, data = 0x55af05dccea0 <consoles>, name = 0x7f637400b610 "netconsole", ' ' <repeats 11 times>, "started at [ 1569] asterisk.c listener()"}
       lock_info = 0x7f638000ed10
       mutex_attr = {__size = "001000000", __align = 1}
       __PRETTY_FUNCTION__ = "dummy_start"
#15 0x00007f639bcabfa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
       ret = <optimized out>
       pd = <optimized out>
       now = <optimized out>
       unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140065066030848, 4886414406491782853, 140065747647294, 140065747647295, 140065066030848, 0, -4821410162935120187, -4821314962282955067}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
       not_first_call = <optimized out>
#16 0x00007f639b73feff in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Comments:By: Asterisk Team (asteriskteam) 2022-10-16 14:33:21.460-0500

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution. Please note that log messages and other files should not be sent to the Sangoma Asterisk Team unless explicitly asked for. All files should be placed on this issue in a sanitized fashion as needed.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

Please note that once your issue enters an open state it has been accepted. As Asterisk is an open source project there is no guarantee or timeframe on when your issue will be looked into. If you need expedient resolution you will need to find and pay a suitable developer. Asking for an update on your issue will not yield any progress on it and will not result in a response. All updates are posted to the issue when they occur.

Please note that by submitting data, code, or documentation to Sangoma through JIRA, you accept the Terms of Use present at [https://www.asterisk.org/terms-of-use/|https://www.asterisk.org/terms-of-use/].

By: Joshua C. Colp (jcolp) 2022-10-16 14:44:24.875-0500

PJSIP provides no capability to unregister such things. The modules should most likely be prevented from unloading.

By: N A (InterLinked) 2022-10-16 15:28:45.470-0500

Hmm, that would certainly be a temporary fix but it seems like a hack.... it's impossible to work on a pubsub module without being able to gracefully reload it... it defeats the entire point of modules existing.
I've opened a related issue here in the meantime: https://github.com/pjsip/pjproject/issues/3268

By: Joshua C. Colp (jcolp) 2022-10-16 15:33:41.035-0500

Modules exist for different purposes. To selectively load the functionality you want is one. The fact that some modules can be unloaded and then loaded again is a nice to have, and it is not commonly done in production by users.

By: Friendly Automation (friendly-automation) 2022-10-26 09:07:46.317-0500

Change 19485 merged by Friendly Automation:
res_pjsip_pubsub: Prevent removing subscriptions.


By: Friendly Automation (friendly-automation) 2022-10-26 09:08:19.531-0500

Change 19486 merged by Friendly Automation:
res_pjsip_pubsub: Prevent removing subscriptions.


By: Friendly Automation (friendly-automation) 2022-10-26 10:06:36.268-0500

Change 19484 merged by George Joseph:
res_pjsip_pubsub: Prevent removing subscriptions.


By: Friendly Automation (friendly-automation) 2022-10-26 10:06:51.602-0500

Change 19416 merged by George Joseph:
res_pjsip_pubsub: Prevent removing subscriptions.