path: root/res
AgeCommit messageAuthor
2016-06-21res_pjsip_pubsub: Address SEGV when attempting to terminate a subscriptionGeorge Joseph
Occasionally under load we'll attempt to send a final NOTIFY on a subscription that's already been terminated and a SEGV will occur down in pjproject's evsub_destroy function. This is a result of a race condition between all the paths that can generate a notify and/or destroy the underlying pjproject evsub object:
* The client can send a SUBSCRIBE with Expires: 0.
* The client can send a SUBSCRIBE/refresh.
* The subscription timer can expire.
* An extension state can change.
* An MWI event can be generated.
* The pjproject transaction timer (timer_b) can expire.
Normally when our pubsub_on_evsub_state is called with a terminate, we push a task to the serializer and return at which point the dialog is unlocked. This is usually not a problem because the task runs immediately and locks the dialog again. When the system is heavily loaded though, there may be a delay between the unlock and relock during which another event may occur such as the subscription timer or timer_b expiring, an extension state change, etc. These may also cause a terminate to be processed and if so, we could cause pjproject to try to destroy the evsub structure twice. There's no way for us to tell that the evsub was already destroyed and the evsub's group lock can't tolerate this and SEGVs.
The remedy is twofold.
* A patch has been submitted to Teluu and added to the bundled pjproject which adds add/decrement operations on evsub's group lock.
* In res_pjsip_pubsub:
  * configure.ac and pjproject-bundled's configure.m4 were updated to check for the new evsub group lock APIs.
  * We now add a reference to the evsub group lock when we create the subscription and remove the reference when we clean up the subscription. This prevents evsub from being destroyed before we're done with it.
  * A state has been added to the subscription tree structure so termination progress can be tracked through the asynchronous tasks.
  * The pubsub_on_evsub_state callback has been split so it's not doing double duty. It now only handles the final cleanup of the subscription tree. pubsub_on_rx_refresh now handles both client refreshes and client terminates. It was always being called for both anyway.
  * The serialized_on_server_timeout task was removed since serialized_pubsub_on_rx_refresh was almost identical.
  * Missing state checks and ao2_cleanups were added.
  * Some debug levels were adjusted to make seeing only off-nominal things at level 1 and nominal or progress things at level 2+.
ASTERISK-26099 #close Reported-by: Ross Beer. Change-Id: I779d11802cf672a51392e62a74a1216596075ba1
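The idea of tracking termination progress with an explicit state can be sketched as follows. This is a minimal Python illustration, not the C code in res_pjsip_pubsub; the state names, the `Subscription` class, and the counter standing in for the final NOTIFY are all hypothetical.

```python
import threading
from enum import Enum, auto


class TerminateState(Enum):
    # Hypothetical state names; the commit only says that a state was added.
    ACTIVE = auto()
    PENDING = auto()
    IN_PROGRESS = auto()
    TERMINATED = auto()


class Subscription:
    def __init__(self):
        self._lock = threading.Lock()
        self.state = TerminateState.ACTIVE
        self.final_notifies_sent = 0

    def request_terminate(self):
        """Called by any event source; returns True only for the first caller."""
        with self._lock:
            if self.state is not TerminateState.ACTIVE:
                return False
            self.state = TerminateState.PENDING
            return True

    def run_terminate_task(self):
        """The queued task; performs the teardown at most once."""
        with self._lock:
            if self.state is not TerminateState.PENDING:
                return
            self.state = TerminateState.IN_PROGRESS
        # Stand-in for sending the final NOTIFY and releasing resources.
        self.final_notifies_sent += 1
        with self._lock:
            self.state = TerminateState.TERMINATED
```

Whichever path asks for termination first wins; later requests and duplicate task runs see a non-ACTIVE or non-PENDING state and do nothing, so the teardown cannot run twice.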
2016-06-21Merge "fix: memory leaks, resource leaks, out of bounds and bugs"zuul
2016-06-20fix: memory leaks, resource leaks, out of bounds and bugsAlexei Gradinari
ASTERISK-26119 #close Change-Id: Iecbf7d0f360a021147344c4e83ab242fd1e7512c
2016-06-20ARI: Ensure announcer channels are destroyed.Mark Michelson
Announcer channels were not being destroyed because the stasis_app_control structure that referenced them was not being destroyed. The control structure was not being destroyed because it was not being unlinked from its container. It was not being unlinked from its container because the after bridge callback for the announcer channel was not being run. The after bridge callback was not being run because the after bridge datastore was not being removed from the channel on destruction. The channel was not being destroyed because the hangup that used to destroy the channel was now only reducing the reference count to one. The reference count of the channel was only being reduced to one because the stasis_app_control structure was holding the final reference... The control structure used to not keep a reference to the channel, so that loop described above did not happen. The solution is to manually remove the control structure from its container when the playback on a bridge is complete. ASTERISK-26083 #close Reported by Joshua Colp Change-Id: I0ddc0f64484ea0016245800b409b567dfe85cfb4
2016-06-15res_pjsip_transport_management.c: Misc cleanups to survive shutdown.Richard Mudgett
* In unload_module(), reordered destroying things to minimize the window that the global transports container could be used by other threads on shutdown. When shutting down you need to stop things in the opposite order of creation.
* Put the global transports container into an AO2_GLOBAL_OBJ_STATIC to eliminate the crash potential by other threads using the container on shutdown.
* Made struct monitored_transport.sip_received not use ast_atomic_fetchadd_int() since it is used as a boolean value that is only set TRUE. It was previously incremented for every received SIP message and could theoretically overflow.
* In monitored_transport_state_callback(), allocated the monitored transport object without a lock since the lock was unused.
* In keepalive_global_loaded(), removed releasing the transports container if the keepalive_thread could not be started. I set it up to be tried again if the user reloads the configuration.
Change-Id: I8d12d16ef564290fa6d25a32334bb5ce8fdf87ff
2016-06-14res_pjsip.c: Add check that timer actually got scheduled.Richard Mudgett
Change-Id: Iabaa2e5dccf0762c258101ea0eb1487cf6959ad1
2016-06-14Merge "res_pjsip_session.c: Reorganize ast_sip_session_terminate()."zuul
2016-06-13res_rtp_multicast.c: Fix warning message typo.Richard Mudgett
Change-Id: Ic9928208b9957e09866abe3d9649030942ec52b3
2016-06-10res_pjsip_session.c: Reorganize ast_sip_session_terminate().Richard Mudgett
Change-Id: I68a2128bcba4830985d2d441e70dfd1ac5bd712b
2016-06-09Merge "ARI: Ensure proper channel state on operations."zuul
2016-06-09ARI: Ensure proper channel state on operations.Mark Michelson
ARI was recently outfitted with operations to create and dial channels. This leads to the ability to try funny stuff. You could create a channel and then immediately try to play back media on it. You could create a channel, dial it, and while it is ringing attempt to make it continue in the dialplan. This commit attempts to fix this by adding a channel state check to operations that should not be able to operate on outbound channels that have not yet answered. If a channel is in an invalid state, we will send a 412 response. ASTERISK-26047 #close Reported by Mark Michelson Change-Id: I2ca51bf9ef2b44a1dc5a73f2d2de35c62c37dfd8
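A state check of this kind might look like the sketch below. It is written in Python for illustration only; the set of rejected states (Down, Reserved, Ringing) is an assumption about what "not yet answered" covers, and `precondition_status` is a hypothetical helper name.

```python
from enum import Enum


class ChannelState(Enum):
    DOWN = "Down"
    RESERVED = "Reserved"
    RINGING = "Ringing"
    UP = "Up"


# Assumption: these states mean "outbound channel that has not yet answered".
NOT_YET_ANSWERED = {
    ChannelState.DOWN,
    ChannelState.RESERVED,
    ChannelState.RINGING,
}


def precondition_status(state):
    """Return 412 if the operation must be rejected, or None if it may proceed."""
    if state in NOT_YET_ANSWERED:
        return 412
    return None
```

An operation handler would call the check first and send the 412 response instead of acting on the channel when it returns a status.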
2016-06-09res_pjsip_registrar.c: Eliminate rx REGISTER request race condition.Richard Mudgett
This patch fixes a race condition processing received REGISTER requests and their retransmissions caused by REGISTER requests being processed by two threads. The "sip_transaction Unable to register REGISTER transaction (key exists)" message is a notable symptom of this issue.
This issue was more likely to happen before the pjsip/distributor serializers were created. Instead of steps one and two below placing the REGISTER messages into the same pjsip/distributor they were placed in random pjsip/default serializers.
1) REGISTER requests come in and get placed on the pjsip/distributor serializer.
2) Before the first request is processed a retransmission comes in and is placed on the same pjsip/distributor serializer.
3) The first request goes up the pjsip stack and is then shunted off to the pjsip/aor/<aor> serializer.
4) Before the first request is completed processing in the pjsip/aor/<aor> serializer, the second request goes up the pjsip stack and is also shunted off to the pjsip/aor/<aor> serializer.
5) The first request completes processing and sends out its response.
6) The second request completes processing and tries to send out its response but pjlib complains that the REGISTER transaction key already exists.
7) Sadness ensues.
* The race is eliminated by removing the pjsip/aor/<aor> serializer and continuing the processing in the pjsip/distributor serializer. Now any retransmissions queued in the pjsip/distributor serializer will be processed after the first message is completely processed.
ASTERISK-26088 #close Reported by: Richard Mudgett Change-Id: I842d714346088bf717ea27437f1dd85bff0bab5a
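Why a single FIFO serializer removes the race can be sketched with a toy model. The Python below is not the pjsip or Asterisk implementation; `Serializer`, `make_register_handler`, and the log labels are hypothetical, and the "transaction table" is just a set of keys.

```python
from collections import deque


class Serializer:
    """Runs queued tasks one at a time, strictly in arrival order."""

    def __init__(self):
        self._queue = deque()

    def push(self, task):
        self._queue.append(task)

    def run(self):
        while self._queue:
            self._queue.popleft()()


def make_register_handler(transactions, log):
    def handle(key):
        # By the time a retransmission runs, the first request has fully
        # completed, so the existing transaction is always visible here.
        if key in transactions:
            log.append(("retransmission", key))
            return
        transactions.add(key)
        log.append(("processed", key))

    return handle
```

Because both the request and its retransmission sit on the same queue, the second task can never start while the first is still partway through creating its transaction.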
2016-06-09sorcery: Add setting object type congestion levels.Richard Mudgett
Sorcery creates taskprocessors for object types to process object observer callbacks. An API call is needed to be able to set the congestion levels of these taskprocessors for selected object types. * Updated PJSIP's contact and contact_status sorcery object type observer default congestion levels based upon stress testing. Increased the congestion levels to reduce the potential for bursty register/unregister and subscribe/unsubscribe activity from triggering the taskprocessor overload alert. ASTERISK-26088 Reported by: Richard Mudgett Change-Id: I4542e83b556f0714009bfeff89505c801f1218c6
2016-06-09taskprocessors: Implement high/low water mark alerts.Richard Mudgett
When taskprocessors get backed up, there is a good chance that we are being overloaded and need to defer adding new work to the system.
* Implemented a high/low water alert mechanism for modules to check if the system is being overloaded and take appropriate action. When a taskprocessor is created it has default congestion levels set. A taskprocessor can later have those congestion levels altered for specific needs if stress testing shows that the taskprocessor is a symptom of overloading or needs to handle bursty activity without triggering an overload alert.
* Add CLI "core show taskprocessor" low/high water columns.
* Fixed __allocate_taskprocessor() to not use RAII_VAR(). RAII_VAR() was never a good thing to use when creating a taskprocessor because of the nature of how its references needed to be cleaned up on a partial creation.
* Made res_pjsip's distributor check if the taskprocessor overload alert is active before placing a message representing brand new work onto a distributor serializer.
ASTERISK-26088 Reported by: Richard Mudgett Change-Id: I182f1be603529cd665958661c4c05ff9901825fa
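A high/low water mark alert is a form of hysteresis: the alert turns on when the queue grows to the high mark and stays on until the queue drains to the low mark. The Python sketch below illustrates that behaviour only; `WaterMarkQueue` is a hypothetical class, and the exact comparison rules in the taskprocessor code may differ.

```python
from collections import deque


class WaterMarkQueue:
    """Task queue that raises an overload alert with hysteresis."""

    def __init__(self, low_water, high_water):
        if not 0 <= low_water < high_water:
            raise ValueError("need 0 <= low_water < high_water")
        self.low_water = low_water
        self.high_water = high_water
        self.alert = False
        self._tasks = deque()

    def push(self, task):
        self._tasks.append(task)
        # Trigger the alert once the backlog reaches the high water mark.
        if not self.alert and len(self._tasks) >= self.high_water:
            self.alert = True

    def pop(self):
        task = self._tasks.popleft()
        # Clear the alert only after draining down to the low water mark.
        if self.alert and len(self._tasks) <= self.low_water:
            self.alert = False
        return task
```

The gap between the two marks keeps the alert from flapping when the queue length hovers around a single threshold. A producer of brand new work would check `alert` before enqueueing.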
2016-06-09res_pjsip_session: Use distributor serializer for incoming calls.Richard Mudgett
We must continue using the serializer that the original INVITE came in on for the dialog. There may be retransmissions already enqueued in the original serializer that can result in reentrancy and message sequencing problems. Outgoing call legs create the pjsip/outsess/<endpoint> serializers for their dialogs. ASTERISK-26088 Reported by: Richard Mudgett Change-Id: I24d7948749c582b8045d5389ba3f6588508adbbc
2016-06-09res_pjsip_pubsub.c: Recreate subscriptions using distributor serializer.Richard Mudgett
* Resolves potential reentrancy problems if system restarted in the middle of subscription message transactions. * Fixes memory leak recreating persistent subscriptions when the subscription resource tree could not be created. ASTERISK-26088 Reported by: Richard Mudgett Change-Id: I71e34d7ae8ed35a694f1030e820e2548c48697be
2016-06-09res_pjsip_pubsub.c: Use distributor serializer for incoming subscriptions.Richard Mudgett
We must continue using the serializer that the original SUBSCRIBE came in on for the dialog. There may be retransmissions already enqueued in the original serializer that can result in reentrancy and message sequencing problems. The "sip_transaction Unable to register SUBSCRIBE transaction (key exists)" message is a notable symptom of this issue. Outgoing subscriptions still create the pjsip/pubsub/<endpoint> serializers for their dialogs. ASTERISK-26088 Reported by: Richard Mudgett Change-Id: I18b00bb74a56747b2c8c29543a82440b110bf0b0
2016-06-09pjsip_distributor.c: Consistently pick a serializer for messages.Richard Mudgett
Incoming messages that are not part of a dialog or a recognized response to one of our requests need to be sent to a consistent serializer. Under load we may be queueing retransmissions before we can process the original message. We don't need to throw these messages onto random serializers and cause reentrancy and message sequencing problems.
* Created a pool of pjsip/distributor serializers that get picked by hashing the call-id and remote tag strings of the received messages.
* Made ast_sip_destroy_distributor() destroy items in the reverse order of creation.
ASTERISK-26088 Reported by: Richard Mudgett Change-Id: I2ce769389fc060d9f379977f559026fbcb632407
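Picking a serializer by hashing can be sketched as below. This is an illustrative Python version, not the distributor's C code: the pool size of 31, the CRC32 hash, and the separator byte are all assumptions made for the example.

```python
import zlib

POOL_SIZE = 31  # assumed pool size, for illustration only


def pick_serializer(call_id, remote_tag, pool_size=POOL_SIZE):
    """Map a (Call-ID, remote tag) pair to a stable index into the pool."""
    # A separator keeps ("ab", "c") and ("a", "bc") from hashing identically.
    key = (call_id + "\x00" + remote_tag).encode("utf-8")
    # CRC32 is a stand-in hash; any deterministic string hash would do.
    return zlib.crc32(key) % pool_size
```

Since a retransmission carries the same Call-ID and tag as the original message, it always maps to the same index and therefore queues behind the original on the same serializer.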
2016-06-09pjsip_distributor.c: Ignore messages until fully booted.Richard Mudgett
We should not be processing any incoming messages until we are fully booted. We may not have dialplan or other needed configuration loaded yet. ASTERISK-26089 #close Reported by: Scott Griepentrog ASTERISK-26088 Reported by: Richard Mudgett Change-Id: I584aefb4f34b885a8927e1f13a2c64babd606264
2016-06-09Merge "Fixes to include signal.h"Joshua Colp
2016-06-09Merge "Make use of GLOB_BRACE and GLOB_NOMAGIC optional"Joshua Colp
2016-06-08Fixes to include signal.hTimo Teräs
POSIX defines signal.h. sys/signal.h should not be used, as it is a C library internal header that may or may not exist. Notably, with musl it generates a warning that the include is incorrect. Change-Id: Ia56b0aa1d84b5c590114867b1b384a624f39a6fc
2016-06-08res_hep_{pjsip|rtcp}: Decline module loads if res_hep had not loadedMatt Jordan
A crash can occur in res_hep_pjsip or res_hep_rtcp if res_hep has not loaded and does not have a configuration file. Previously when this occurred, checks were put in to see if the configuration was loaded successfully. While this is a good idea - and has been added to the offending function in res_hep - the reality is res_hep_pjsip and res_hep_rtcp have no business running if res_hep isn't also running. As such, this patch also adds a function to res_hep that returns whether or not it successfully loaded. Oddly enough, ast_module_check returns "everything is peachy" even if a module declined its load - so it cannot be solely relied on. res_hep_pjsip and res_hep_rtcp now also check this function to see if they should continue to load; if it fails, they decline their load as well. ASTERISK-26096 #close Change-Id: I007e535fcc2e51c2ca48534f48c5fc2ac38935ea
2016-06-08Merge "ari/resource_channels: Add 'formats' to channel create/originate"Joshua Colp
2016-06-07Merge "res_odbc: Implement a connection pool."Joshua Colp
2016-06-07res_odbc: Implement a connection pool.Joshua Colp
Testing has shown that our usage of UnixODBC is problematic due to bugs within UnixODBC itself as well as the heavy weight cost of connecting and disconnecting database connections, even when pooling is enabled. For users of UnixODBC 2.3.1 and earlier crashes would occur due to insufficient protection of the disconnect operation. This was fixed in UnixODBC 2.3.2 and above. For users of UnixODBC 2.3.3 and higher a slow-down would occur under heavy database use due to repeated connection establishment. A regression is present where on each connection the database configuration is cached again, with the cache growing out of control. The connection pool implementation present in this change helps to mitigate these issues by reducing how much we connect and disconnect database connections. We also solve the issue of crashes under UnixODBC 2.3.1 by defaulting the maximum number of connections to 1, returning us to the previous working behavior. For users who may have a fixed version the maximum concurrent connection limit can be increased helping with performance. The connection pool works by keeping a list of active connections. If the connection limit has not been reached a new connection is established. If the connection limit has been reached then the request waits until a connection becomes available before continuing. ASTERISK-26074 #close ASTERISK-26054 #close Change-Id: I6774bf4bac49a0b30242c76a09c403d2e856ecff
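The pooling behaviour described above (reuse or create a connection while under the limit, otherwise wait) can be sketched as follows. This is a generic Python illustration, not res_odbc's implementation; `ConnectionPool`, its `connect` callback, and the optional `timeout` are hypothetical.

```python
import threading


class ConnectionPool:
    def __init__(self, connect, max_connections=1):
        self._connect = connect      # callable that opens a new connection
        self._max = max_connections  # default of 1 mirrors the commit text
        self._idle = []
        self._count = 0              # connections created so far
        self._cond = threading.Condition()

    def acquire(self, timeout=None):
        with self._cond:
            while True:
                if self._idle:
                    return self._idle.pop()
                if self._count < self._max:
                    self._count += 1
                    break
                # Limit reached: wait until release() hands one back.
                if not self._cond.wait(timeout):
                    raise TimeoutError("no connection became available")
        try:
            # Connect outside the lock so slow connects do not block others.
            return self._connect()
        except Exception:
            with self._cond:
                self._count -= 1
                self._cond.notify()
            raise

    def release(self, conn):
        with self._cond:
            self._idle.append(conn)
            self._cond.notify()
```

With `max_connections=1` every request shares one connection, which matches the conservative default the commit describes; raising the limit trades that safety for concurrency.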
2016-06-07res_srtp: Instead of libSRTP use OpenSSL as random source.Alexander Traud
Since libSRTP 1.5, its Random Number Generator (RNG) is not maintained anymore. Therefore, the symbol RAND_bytes is used instead of crypto_get_random. ASTERISK-24436 #close Change-Id: Iea0bae4d4e3c9aa0926ea442b6484b5159789d96
2016-06-03ari/resource_channels: Add 'formats' to channel create/originateGeorge Joseph
If you create a local channel and don't specify an originator channel to take capabilities from, we automatically add all audio formats to the new channel's capabilities. When we try to make the channel compatible with another, the "best format" functions pick the best format available, which in this case will be slin192. While this is great for preserving quality, it's the worst for performance and overkill for the vast majority of applications. In the absence of any other information, adding all formats is the correct thing to do and it's not always possible to supply an originator so a new parameter 'formats' has been added to the channel create/originate functions. It's just a comma-separated list of formats to make available for the channel. Example: "ulaw,slin,slin16". 'formats' and 'originator' are mutually exclusive. To facilitate determination of format names, the format name has been added to "core show codecs". ASTERISK-26070 #close Change-Id: I091b23ecd41c1b4128d85028209772ee139f604b
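The parameter rules described above can be sketched as a small validation helper. This Python is illustrative only and is not the ARI implementation; `resolve_channel_formats` and the returned labels are hypothetical.

```python
def resolve_channel_formats(formats=None, originator=None):
    """Decide where a new channel's formats come from.

    Returns a (source, names) tuple, where source is one of
    "formats", "originator", or "all-audio".
    """
    if formats and originator:
        raise ValueError("'formats' and 'originator' are mutually exclusive")
    if formats:
        names = [name.strip() for name in formats.split(",")]
        if any(not name for name in names):
            raise ValueError("empty format name in list")
        return ("formats", names)
    if originator:
        return ("originator", None)
    # Neither given: fall back to adding all audio formats.
    return ("all-audio", None)
```

For example, `resolve_channel_formats(formats="ulaw,slin,slin16")` selects the three named formats, while passing both arguments is rejected.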
2016-06-03Make use of GLOB_BRACE and GLOB_NOMAGIC optionalTimo Teräs
These flags are non-portable GNU extensions. Make their use optional. This fixes a compilation error on, e.g., musl C library based systems. Change-Id: I0aa06efc62aa8995f091445c8b762a75a91042f3
2016-05-31pjsip_distributor.c: Use correct rdata info access method (Part 2).Richard Mudgett
The pjproject doxygen for rdata->msg_info.info says to call pjsip_rx_data_get_info() instead of accessing the struct member directly. You need to call the function mostly because the function will generate the struct member value if it is not already set up. Change-Id: I4d519385a577f3e9d9193a88125e493cf17fa799
2016-05-31Merge "ARI: Re-implement the ARI dial command, allowing for early bridging."zuul
2016-05-31Merge "res_pjsip_mwi_body_generator: Re-order the body items"zuul
2016-05-31Merge "res_pjsip: add "via_addr", "via_port", "call_id" to contact"Joshua Colp
2016-05-31Merge "res_pjsip: Add clarifying documentation to PJSIP_HEADER help text"zuul
2016-05-31Merge "multicast RTP: Add dialing options"zuul
2016-05-31Merge "res_pjsip: chatty verbose messages"zuul
2016-05-30res_pjsip_mwi_body_generator: Re-order the body itemsGeorge Joseph
Re-ordered the body items so Message-Account is second.
Messages-Waiting: no
Message-Account: sip:1571@<IP Removed>:5060
Voice-Message: 0/0 (0/0)
ASTERISK-26065 #close Reported-by: Ross Beer Change-Id: If5d35a64656eac98c2dd5e490cc0b2807bed80c3
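The ordering above can be sketched as a small body builder. This Python is an illustration, not the module's C code; `build_mwi_body` is a hypothetical helper, the urgent-message counts are fixed at 0/0, and the CRLF line endings are an assumption based on the usual SIP message-summary format.

```python
def build_mwi_body(account, new_messages, old_messages):
    """Build a message-summary body with Message-Account as the second line."""
    waiting = "yes" if new_messages > 0 else "no"
    lines = [
        f"Messages-Waiting: {waiting}",
        f"Message-Account: {account}",
        # Urgent new/old counts are fixed at 0/0 in this sketch.
        f"Voice-Message: {new_messages}/{old_messages} (0/0)",
    ]
    return "\r\n".join(lines) + "\r\n"
```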
2016-05-27res_pjsip: Add clarifying documentation to PJSIP_HEADER help textRusty Newton
Added notes about when you can read or write headers. Specifically about being able to read on the inbound channel and write on an outbound channel. ASTERISK-26063 #close Reported by: Private Name Tested by: Rusty Newton Change-Id: Ibeb64af17d1f6451028b3c29855a3f151a01d8c5
2016-05-27multicast RTP: Add dialing optionsMark Michelson
This adds a new parameter to the end of a multicast RTP dialing string. This parameter defines the following options:
* i: Set the interface from which multicast RTP is sent
* l: Set whether multicast packets are looped back to the sender
* t: Set the TTL for multicast packets
* c: Set the codec to use for RTP
ASTERISK-26068 #close Reported by Mark Michelson Change-Id: I033b706b533f0aa635c342eb738e0bcefa07e219
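Parsing such an options parameter might look like the sketch below. The commit message does not show the option syntax, so this Python example assumes a `letter(value)` form such as `i(192.0.2.1)t(4)c(ulaw)`; that form, the `parse_multicast_options` name, and the result keys are all assumptions for illustration.

```python
import re

# Option letters from the commit message, mapped to descriptive keys.
OPTION_NAMES = {"i": "interface", "l": "loop", "t": "ttl", "c": "codec"}

# Assumed syntax: one lowercase letter followed by a parenthesized value.
_OPTION_RE = re.compile(r"([a-z])\(([^()]*)\)")


def parse_multicast_options(text):
    """Parse an option string like 'i(192.0.2.1)t(4)' into a dict."""
    options = {}
    pos = 0
    while pos < len(text):
        match = _OPTION_RE.match(text, pos)
        if match is None:
            raise ValueError(f"malformed options at offset {pos}")
        letter, value = match.groups()
        if letter not in OPTION_NAMES:
            raise ValueError(f"unknown option {letter!r}")
        options[OPTION_NAMES[letter]] = value
        pos = match.end()
    return options
```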
2016-05-27ARI: Re-implement the ARI dial command, allowing for early bridging.Mark Michelson
ARI dial had been implemented using the Dial API. This made great sense when dialing was 100% separate from bridging. However, if a channel were to be added to a bridge during the dial attempt, there would be a conflict between the dialing thread and the bridging thread. Each would be attempting to read frames from the dialed channel and act on them. The initial attempt to make the two play nice was to have the Dial API suspend the channel in the bridge and stay in charge of the channel until the dial was complete. The problem with this was that it was riddled with potential race conditions. It also was not well-suited for the case where the channel changed which bridge it was in during the dial. This new approach removes the use of the Dial API altogether. Instead, the channel we are dialing is placed into an invisible ARI dialing bridge. The bridge channel thread handles incoming frames from the channel. If the channel is added to a real bridge, it is departed from the invisible bridge and then added to the real bridge. Similarly, if the channel is removed from the real bridge, it is automatically added back to the invisible bridge if the dial attempt is still active. This approach keeps the threading simple by always having the channel being handled by bridge channel threads. ASTERISK-25925 Change-Id: I7750359ddf45fcd45eaec749c5b3822de4a8ddbb
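The bridge hand-off described above can be modelled as a small state machine. The Python below is a toy model, not the ARI implementation; `DialedChannel`, the `DIAL_BRIDGE` label, and the behaviour when the dial finishes while the channel is still in the invisible bridge are assumptions.

```python
DIAL_BRIDGE = "ari-dial-bridge"  # hypothetical label for the invisible bridge


class DialedChannel:
    def __init__(self):
        self.dial_active = False
        self.bridge = None

    def start_dial(self):
        # The dialed channel always lives in some bridge while dialing.
        self.dial_active = True
        self.bridge = DIAL_BRIDGE

    def add_to_bridge(self, bridge_id):
        # Departing the invisible bridge is implicit in this model.
        self.bridge = bridge_id

    def remove_from_bridge(self):
        # Fall back to the invisible bridge while the dial is still active.
        self.bridge = DIAL_BRIDGE if self.dial_active else None

    def finish_dial(self):
        self.dial_active = False
        if self.bridge == DIAL_BRIDGE:
            # Assumption: the invisible bridge is only needed during the dial.
            self.bridge = None
```

At every point during the dial the channel belongs to exactly one bridge, so only a bridge channel thread ever reads its frames.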
2016-05-26res_pjsip: add "via_addr", "via_port", "call_id" to contactAlexei Gradinari
As res_pjsip_nat rewrites contact's address, only the last Via header can contain the source address of registered endpoint. Also Call-Id header may contain the source address of registered endpoint. Added "via_addr", "via_port", "call_id" to contact. Added new fields ViaAddress, CallID to AMI event ContactStatus. ASTERISK-26011 Change-Id: I36bcc0bf422b3e0623680152d80486aeafe4c576
2016-05-26res_pjsip: chatty verbose messagesAlexei Gradinari
There are a lot of verbose messages about Endpoint and Contact status changes if there are many dynamic endpoints. The patch sets verbose level 2 for Endpoint status changes and verbose level 3 for Contact status changes. ASTERISK-26055 #close Change-Id: Ie64e261ddbbc41bfff0f0190241152cc123fe6d7
2016-05-26pjsip_distributor.c: Use correct rdata info access method.Richard Mudgett
The pjproject doxygen for rdata->msg_info.info says to call pjsip_rx_data_get_info() instead of accessing the struct member directly. You need to call the function mostly because the function will generate the struct member value if it is not already set up. Change-Id: Iafe8b01242b7deb0ebfdc36685e21374a43936d2
2016-05-25Merge "res_pjsip_outbound_publish: Ensure publish is valid when explicitly destroying."zuul
2016-05-24Merge "res_pjsip: Only check transaction on transaction state events."zuul
2016-05-24res_pjsip_outbound_publish: Ensure publish is valid when explicitly destroying.Joshua Colp
Recent changes to res_pjsip_outbound_publish have introduced a race condition at shutdown where an outbound publish may be shut down twice. In this case the first succeeds as a result of the unpublish. In the second invocation, since it's been unpublished, a task is queued to just destroy the client. This task holds no ref to the publish and as a result the publish may be destroyed before the task is run, causing a crash. This explicit destruction task now holds a reference to the publish to ensure it remains valid. ASTERISK-26053 #close Change-Id: I10789b98add3e50292ee3b33a55a1d9061cec94b
2016-05-23Merge "ARI: Add the ability to download the media associated with a stored recording"Joshua Colp
2016-05-22res_pjsip: Only check transaction on transaction state events.Joshua Colp
The send request callback function currently assumes that it will only ever be called on transaction state changes. This is not always true. If our own timer callback occurs we will call the callback with a timer event instead of a transaction state change event. In this case the transaction on the event is invalid and accessing it will result in a crash. ASTERISK-26049 #close Change-Id: I623211c8533eb73056b0250b4580b49ad4174dfc
2016-05-20res_pjsip: Match dialogs on responses better.Mark Michelson
When receiving an incoming response to a dialog-starting INVITE, we were not matching the response to the INVITE dialog. Since we had not recorded the to-tag to the dialog structure, the PJSIP-provided method to find the dialog did not match. Most of the time, this was not a problem, because there is a fall-back that makes the response get routed to the same serializer that the request was sent on. However, in cases where an asynchronous DNS lookup occurs in the PJSIP core, the thread that sends the INVITE is not actually a threadpool serializer thread. This means we are unable to record a serializer to handle the incoming response. Now, imagine what happens when an INVITE is sent on a non-serialized thread, and an error response (such as a 486) arrives. The 486 ends up getting put on some random threadpool thread. Eventually, a hangup task gets queued on the INVITE dialog serializer. Since the 486 is being handled on a different thread, the hangup task can execute at the same time that the 486 is being handled. The hangup task assumes that it is the sole owner of the INVITE session and channel, so it ends up potentially freeing the channel and NULLing the session's channel pointer. The thread handling the 486 can crash as a result. This change has the incoming response match the INVITE transaction, and then get the dialog from that transaction. It's the same method we had been using for matching incoming CANCEL requests. By doing this, we get the INVITE dialog and can ensure that the 486 response ends up being handled by the same thread as the hangup, ensuring that the hangup runs after the 486 has been completely handled. ASTERISK-25941 #close Reported by Javier Riveros Change-Id: I0d4cc5d07e2a8d03e9db704d34bdef2ba60794a0
2016-05-20ARI: Add the ability to download the media associated with a stored recordingMatt Jordan
This patch adds a new feature to ARI that allows a client to download the media associated with a stored recording. The new route is /recordings/stored/{name}/file, and transmits the underlying binary file using Asterisk's HTTP server's underlying file transfer facilities. Because this REST route returns non-JSON, a few small enhancements had to be made to the Python Swagger generation code, as well as the mustache templates that generate the ARI bindings. ASTERISK-26042 #close Change-Id: I49ec5c4afdec30bb665d9c977ab423b5387e0181