IoT-based sound-level control for audio amplifiers: mosques as a case study

ABSTRACT


INTRODUCTION
People tend to use loudspeakers during festivals and large gatherings, in theaters, and inside large halls or out in the open. However, careful placement of loudspeakers and careful setting of sound levels are necessary for the comfort of the audience, especially in the open, where weather conditions are not stable. Unfortunately, this is often not the case: some members of the audience are exposed to loud sound levels from more than one speaker, while others barely hear the sound. For this reason, automatic and dynamic setting of sound levels is necessary for people's hearing health and the overall comfort of the audience. In this paper, mosques are studied as a case study, and a solution based on the internet of things (IoT) [1] is proposed.

Mosques are everywhere in the Islamic world, but they are not connected. In the Islamic world, each mosque is the source of at least five Azan firings a day. Azan is fired roughly every four hours, and environmental conditions vary over that interval. These variations make it necessary to reset the sound level of the audio amplifier at each mosque. Mosques, in this context, are things that need to be connected for the purpose of sound level control. In today's terms, IoT is a candidate for connecting these things.

The IoT model is used in a wide range of applications. Baucas and Spachos [2] proposed an IoT sensing framework that can be used in a sound classification model. To realize sound in a human-machine interface, Cho et al. [3] used an IoT model. Many researchers have used some form of IoT model in their research [4]-[9].

Prayers are performed five times a day at predetermined and accurate times. The time of Azan (call to prayer), although it could be announced by other means such as light from light minarets [11], should be the same in all mosques in a region for each prayer. When Azan is fired as a sound, it plays a critical role in the Muslim world as a means of prosocial behavior [12]. Azan, which has a duration of around three minutes, should be unified to have a pleasant impact on people. Audio-amplifier volume should be set to minimize overlap between adjacent mosques, to mitigate Azan interference from different sources (mosques). The main aim of this research is to propose a system that sets the sound levels of audio amplifiers in a region such that the whole region is covered and overlap is minimized.
The rest of this paper is organized as follows. Related work is discussed in section 2. Section 3 provides a potential way of connecting mosques in the community. Insight into sound level measurement is given in section 4. In section 5, we give the details of the proposed system. A proof of concept is given in section 6. Section 7 concludes this paper.

RELATED WORK
Unifying Azan is an issue that has been tackled in several regions by local authorities. Amman and Tafila, for instance, are two cities in Jordan where Azan is broadcast by a national radio station. Each mosque has a radio receiver tuned to the transmitter frequency and connected to the audio amplifier. However, this solution is not reliable because it constitutes a single point of failure, where downtime is large. This kind of unified Azan is also found in Cairo (Egypt) and Abu Dhabi (UAE). These examples suffer from the same single-point-of-failure problem [13]. In [13], the authors proposed a framework for unifying the call to prayer in the Islamic world. This work builds upon theirs and provides an IoT design that can be used to control the volume of audio amplifiers in mosques.
Using IoT to deal with sounds is not new. Shah et al. [14] proposed an IoT platform that gathers sounds in homes and directs the recordings to a classification server, which uses machine-learning techniques to identify several kinds of sounds, such as explosions and gunshots. By this, they claimed, homes would be safer. This work lacks real-time feedback for the recorded sounds. In a patent [15], an IoT device is used as a proximity sensor that depends on the time difference between a wireless transmission and a sound transmission. However, this is only valid for indoor distances and for one IoT device at a time. Zhang et al. [16] used an IoT platform to analyze the sound spectrum in forests to distinguish between crown and surface fires. Their work cannot be used to determine the distance of the fire. Patil [17] used an IoT platform to decide whether any car in a city crosses a noise threshold, in order to report it to the authorities. This work lacks the distance dimension. Shah et al. [18] used an IoT platform along with artificial intelligence (AI) to classify noises in urban areas and log them to the cloud, to be used when choosing a quiet place to live. The authors in [19] used an IoT-based system along with supervised learning algorithms to monitor noise levels in a city. However, the noise source cannot be identified using their proposed platforms.
In this research, a novel procedure for automating control of the audio-amplifier sound volume is proposed. Tones with several known frequencies are monitored using the fast Fourier transform (FFT) algorithm. Tones are fired repeatedly, starting with the highest level and decreasing in subsequent firings, with a preset delay between them. Each level is associated with a distance; therefore, upon capturing a level, the distance is known. After that, mosques are ready to fire Azan at a suitable level.

MOSQUES IN THE COMMUNITY
A sufficient number of mosques is found in a community, where each one normally covers a circular area of around 500 m radius. Several events are held in mosques besides the five prayers, e.g., religious lectures by keynote speakers. Many people visit mosques around the clock to perform their prayers or attend other events. Any public announcement inside the mosque can quickly reach most of the community. The most important attribute of a mosque here is that it is a source of loud sound, as the output of its audio amplifier. A mosque is called a node from now on. Figure 1 shows the proposed connection of the nodes in a community. These nodes are connected by an intranet. The main part that keeps these nodes connected is an IoT device. In this configuration, these devices use the message queue telemetry transport (MQTT) protocol [20] to send tokens to and receive tokens from other nodes in the area. MQTT is a lightweight [21] transport protocol on top of TCP/IP that uses a publish/subscribe scheme for message exchange. Connections to the outside go through the gateway. This configuration helps limit traffic on the network and mitigate congestion during sound level control.

ISSN: 2088-8708 
IoT-based sound-level control for audio amplifiers: mosques as a case study (Naeem Al-Oudat)

To set the output sound level of a node i, the neighboring nodes monitor i's tone level. Node i starts transmitting its tone at the highest level, then decreases this level gradually as feedback from the other nodes arrives at i [13]. This method of level control is time consuming, since decreasing the tone level of node i gradually needs, in principle, an infinite number of steps. Therefore, a fast and practical method for level measurement is described in the next section.

SOUND LEVEL MEASUREMENT
Each node has a known frequency that differs from the others to avoid interference. Each node sends its tone at several levels until the suitable level is determined. This process happens approximately 20 seconds (the actual time depends on the number of levels) prior to the actual event (Azan, as an example). In the following, the details of this procedure, sound level set (SLSet), are provided, the prototype is built, and the experimental results on the prototype are discussed.
This procedure starts before the actual event and takes around 20 seconds. SLSet depends on the availability of wireless connections between nodes. Once it is time to set the sound levels of the nodes, a trigger from an external source starts the operation. The external trigger source could be the Azan timing board (found in all mosques). We assume that this trigger happens at exactly the same time for all nodes. The neighbors of a node are the nodes that help determine its sound level. These neighbors are saved in the neighbor table. The neighbor table is established based on 2D Delaunay triangulation [6] according to the physical locations of the nodes on the ground. Under normal conditions (no wind, humidity-free air, and no noise), a set of sound levels is associated with distances to make calculations simple. Figure 2(a) shows an example of nodes under normal conditions. The spacing between levels is equal, so each level can be mapped to a distance easily. In the figure, each level adds 50 m to the previous one, where the first level covers a circle of 50 m radius. As weather conditions (mainly wind direction and speed) change, the spacing between levels also changes, and hence the distances between nodes, from each node's perspective, change as shown in Figure 2(b) and (c).
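The neighbor-table construction can be illustrated with a minimal sketch. It assumes node positions are given as planar coordinates in meters and uses a brute-force empty-circumcircle test, which is only practical for the small node counts considered here; the function names and the example coordinates are hypothetical and not taken from the paper.

```python
from itertools import combinations

def circumcircle(a, b, c):
    """Return (center_x, center_y, radius_squared), or None if collinear."""
    ax, ay = a
    bx, by = b
    cx, cy = c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return ux, uy, (ax - ux) ** 2 + (ay - uy) ** 2

def delaunay_triangles(points):
    """Triangles (as index triples) whose circumcircle contains no other point."""
    triangles = []
    for i, j, k in combinations(range(len(points)), 3):
        circle = circumcircle(points[i], points[j], points[k])
        if circle is None:
            continue
        ux, uy, r2 = circle
        others = (p for n, p in enumerate(points) if n not in (i, j, k))
        if all((px - ux) ** 2 + (py - uy) ** 2 >= r2 - 1e-9 for px, py in others):
            triangles.append((i, j, k))
    return triangles

def neighbor_table(points):
    """Map each node index to the set of nodes sharing a Delaunay triangle."""
    table = {n: set() for n in range(len(points))}
    for tri in delaunay_triangles(points):
        for u in tri:
            table[u].update(v for v in tri if v != u)
    return table

# Hypothetical layout of four nodes (coordinates in meters).
nodes = [(0, 0), (500, 0), (250, 400), (250, -400)]
table = neighbor_table(nodes)
```

In this layout, nodes 2 and 3 lie on opposite sides of the edge between nodes 0 and 1, so they share no triangle and do not appear in each other's neighbor table.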
Under normal conditions, the Fermat point [22] of the triangle of nodes can be calculated directly; the distance from the Fermat point to each node is then mapped to a suitable level. Let $d_{iF}$ denote the distance between node i and the Fermat point F. Let each sound level cover a circle whose radius is D meters greater than that of the previous level. The sound level for node i is calculated using (1).
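Equation (1) is not reproduced in this text. One reading that is consistent with the definitions above, offered as an assumption rather than as the authors' exact formula, is that node i uses the smallest level whose circle reaches the Fermat point. Writing $d_{iF}$ for the distance between node i and the Fermat point F, $D$ for the radius increment per level, and $L_i$ (a symbol introduced here) for the resulting level:

```latex
L_i = \left\lceil \frac{d_{iF}}{D} \right\rceil
% Example: for an equilateral triangle of side 500 m,
% d_{iF} = 500/\sqrt{3} \approx 288.7 m, so with D = 50 m:
% L_i = \lceil 288.7 / 50 \rceil = \lceil 5.77 \rceil = 6
```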
To calculate the Fermat point, we need to find the logical distances between nodes. In this paper, we propose a system to find the distances between nodes under abnormal conditions, e.g., wind. The actual distances between nodes are around 500 m, so a sound tone needs around 1.5 seconds (the speed of sound in air is 343 m/s) to travel from one node to another. So that each node can determine how far the other nodes at the triangle vertices are, every node is triggered to fire its unique tone at the same time.
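The travel-time figure can be checked with a short calculation. The optional wind term below is an added assumption (a first-order model in which the wind component along the path adds to the speed of sound) and is not part of the procedure described in the paper.

```python
SPEED_OF_SOUND_M_S = 343.0  # value quoted in the text

def travel_time_s(distance_m, wind_along_path_m_s=0.0):
    """Time for a tone to cover distance_m; positive wind means a tailwind."""
    return distance_m / (SPEED_OF_SOUND_M_S + wind_along_path_m_s)

calm = travel_time_s(500.0)            # about 1.46 s, i.e. "around 1.5 seconds"
tailwind = travel_time_s(500.0, 10.0)  # slightly shorter with a 10 m/s tailwind
```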
If the sending nodes have equal logical distances and start firing tones of the same frequency at the same time, interference at the receiving nodes might occur. To solve this problem, nearby nodes should use different tone frequencies. The problem of minimizing the number of frequencies used is similar to the graph coloring problem [23], [24]. This problem is known to be NP-complete; therefore, researchers have proposed greedy algorithms to solve it. In our context, where there is a small number of nodes within the area under consideration, tone frequencies can be assigned manually to each node. In particular, we force all nodes within the coverage area of any node (whether sharing a 2D Delaunay triangle [25] or not) to use different tone frequencies. Note that the frequencies used are in the audible spectrum, up to 10 kHz. The distributed algorithm then proceeds as follows:
− Sufficiently long before the sound firing, all nodes are triggered by an external source (the Azan timing boards in mosques) to send their highest-level tone.
− After a while (the time required to send and process tones), each node knows the logical distances to the other nodes at the triangle vertices.
− At each node, the Fermat points (for all triangles the node is part of) are calculated. Based on that, the maximum distance between the node under consideration and these Fermat points determines its sound level.
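The frequency assignment discussed before the steps above can also be automated with a greedy coloring pass. The following is a minimal sketch, assuming a symmetric coverage map (node to the set of nodes within its coverage area) and a hypothetical list of candidate tone frequencies below 10 kHz; only the 8305 Hz value comes from the paper.

```python
def assign_tone_frequencies(coverage, frequencies):
    """Greedy coloring: give each node the first frequency unused in its coverage area."""
    assignment = {}
    # Visit nodes with the most conflicts first (largest-degree-first heuristic).
    for node in sorted(coverage, key=lambda n: (-len(coverage[n]), n)):
        used = {assignment[m] for m in coverage[node] if m in assignment}
        free = [f for f in frequencies if f not in used]
        if not free:
            raise ValueError(f"not enough tone frequencies for node {node}")
        assignment[node] = free[0]
    return assignment

# Hypothetical coverage map and candidate frequencies in Hz.
coverage = {0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1}, 3: {0, 1}}
tones = assign_tone_frequencies(coverage, [8305, 7000, 6000, 5000])
```

Being greedy, this pass does not guarantee the minimum number of frequencies, but for the small node counts considered here it gives a conflict-free assignment quickly.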
In step 1 of the algorithm, each node starts to determine the distances to its neighboring nodes. This is possible by integrating tone firing with MQTT message passing between nodes. When the time is right, each node fires its tone at the highest level. Sound levels are limited to some maximum number of levels. Each level is mapped to a distance. The distances are equally spaced (logarithmically spaced distances are also possible) to shorten the distance-tuning time.
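The two spacing options can be sketched as follows. The 50 m step matches the example in Figure 2; the number of levels, the 500 m outer radius, and the function names are illustrative assumptions.

```python
def linear_level_radii(num_levels, step_m=50.0):
    """Radius covered by each level when every level adds step_m meters."""
    return [step_m * k for k in range(1, num_levels + 1)]

def log_level_radii(num_levels, first_m=50.0, last_m=500.0):
    """Radii growing by a constant ratio from first_m to last_m (num_levels >= 2)."""
    ratio = (last_m / first_m) ** (1.0 / (num_levels - 1))
    return [first_m * ratio ** k for k in range(num_levels)]

def level_for_distance(distance_m, radii):
    """Smallest level (1-based) whose radius reaches distance_m, capped at the last level."""
    for level, radius in enumerate(radii, start=1):
        if distance_m <= radius:
            return level
    return len(radii)

linear = linear_level_radii(10)   # 50, 100, ..., 500 m
logarithmic = log_level_radii(5)  # 50 m up to about 500 m in 5 steps
```

With logarithmic spacing, fewer levels span the same range, at the cost of coarser distance resolution far from the node.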
In step 2, the time needed to find the appropriate distance depends on several factors: the number of different levels transmitted by the source, the logical distance from the receiver (which depends on the physical distance and the wind speed), and the processing time at the receiver. To eliminate the interference that could happen when a node is firing its tone while receiving other nodes' tones, we restrict tone firing to happen at the same time for all nodes in the area. Therefore, enough time is available for sound propagation through the air and for processing at the receiving nodes.
In step 3, once the previous two steps are completed, a simple calculation of the Fermat point for each triangle (formed by the current node and two other nodes) is conducted. The maximum distance from the current node to these Fermat points is then mapped to the current node's sound level. Figure 3 shows the timings of tone firing, recording of other tones, and processing at one of the nodes.
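Step 3 can be sketched as below. The paper does not state how the Fermat point is computed, so this sketch assumes Weiszfeld's iteration (with the usual special case that a vertex whose angle is at least 120 degrees is itself the Fermat point) and assumes that a distance is mapped to a level by rounding up to the next multiple of the ring step. The function names are hypothetical, and triangles are assumed to be non-degenerate.

```python
import math

def _angle_at(p, q, r):
    """Interior angle (radians) at vertex p of triangle p, q, r."""
    v1 = (q[0] - p[0], q[1] - p[1])
    v2 = (r[0] - p[0], r[1] - p[1])
    cos_a = (v1[0] * v2[0] + v1[1] * v2[1]) / (math.hypot(*v1) * math.hypot(*v2))
    return math.acos(max(-1.0, min(1.0, cos_a)))

def fermat_point(a, b, c, iterations=200):
    """Point minimizing the total distance to the three vertices."""
    # A vertex whose angle is at least 120 degrees is itself the Fermat point.
    for p, q, r in ((a, b, c), (b, c, a), (c, a, b)):
        if _angle_at(p, q, r) >= 2.0 * math.pi / 3.0:
            return p
    # Otherwise apply Weiszfeld's update, starting from the centroid.
    x = (a[0] + b[0] + c[0]) / 3.0
    y = (a[1] + b[1] + c[1]) / 3.0
    for _ in range(iterations):
        wsum = xs = ys = 0.0
        for px, py in (a, b, c):
            d = math.hypot(x - px, y - py)
            if d < 1e-12:
                return (px, py)
            wsum += 1.0 / d
            xs += px / d
            ys += py / d
        x, y = xs / wsum, ys / wsum
    return (x, y)

def sound_level(node, triangles, ring_step_m=50.0):
    """Level for node: farthest Fermat point among its triangles, rounded up (assumed mapping)."""
    farthest = max(math.dist(node, fermat_point(*tri)) for tri in triangles)
    return math.ceil(farthest / ring_step_m)

# Equilateral triangle with 500 m sides (logical distances assumed already measured).
a, b, c = (0.0, 0.0), (500.0, 0.0), (250.0, 250.0 * math.sqrt(3.0))
level_a = sound_level(a, [(a, b, c)])
```

Under windy conditions the same code applies, provided the vertex coordinates are derived from the logical distances rather than the physical ones.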

SYSTEM DESIGN
The main component of the system is the IoT sound-level-set device. These devices are connected to each other through the MQTT protocol, and to other parts of the system over the Internet via a gateway. The software (the procedure that sets the sound level) is also an important part of the system.

System architecture
Figure 4 shows the connections of the IoT-device to the various parts available inside the node and to the rest of the system. The source that triggers the start of the sound-level-control process is the Azan timing board. This board is found in each mosque. The audio amplifier is the second component in this system, and it is also found in each mosque. It is assumed that the volume of the audio amplifier can be controlled through a signal from the IoT-device. The IoT-device is connected to the MQTT broker via an Internet connection. The MQTT broker is a central server shared between all mosques. Each node is a client (publisher and subscriber) in this network architecture.
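The publisher/subscriber role of each node can be illustrated with a minimal sketch. It uses an in-memory stand-in for the broker so that it runs without a network; a deployment would use an MQTT client library against the shared broker instead. The topic layout (`slset/<node>/levels`), the JSON payload, and the class names are assumptions made for illustration.

```python
import json
from collections import defaultdict

class InMemoryBroker:
    """Stand-in for the MQTT broker: delivers each payload to the topic's subscribers."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, payload):
        for callback in list(self._subscribers[topic]):
            callback(topic, payload)

class Node:
    """A mosque node that publishes the levels it heard and collects its neighbors' reports."""

    def __init__(self, node_id, neighbors, broker):
        self.node_id = node_id
        self.broker = broker
        self.heard_by = {}  # neighbor id -> level at which that neighbor heard this node
        for neighbor in neighbors:
            broker.subscribe(f"slset/{neighbor}/levels", self._on_levels)

    def publish_levels(self, levels_heard):
        """levels_heard maps a neighbor id to the level at which its tone was heard here."""
        self.broker.publish(f"slset/{self.node_id}/levels", json.dumps(levels_heard))

    def _on_levels(self, topic, payload):
        sender = topic.split("/")[1]
        levels = json.loads(payload)
        if self.node_id in levels:
            self.heard_by[sender] = levels[self.node_id]

broker = InMemoryBroker()
node0 = Node("ID0", ["ID1"], broker)
node1 = Node("ID1", ["ID0"], broker)
node1.publish_levels({"ID0": 3})  # ID1 reports that it heard ID0's tone at level 3
```

Subscribing only to the topics of its own neighbors keeps each node's traffic proportional to the size of its neighbor table.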

IoT-device
The device is mainly a controller with Wi-Fi connectivity. An ESP32 module (ESP32-WROOM-32) is used for its low price and adequate computational power [26]. A microphone (INMP441 MEMS with I2S output [27]) is connected via an I2S bus. The device outputs the volume control signal to the audio amplifier. The Azan timing board has an output that triggers the start of the sound level set for each node.

System software
The same algorithm runs on each node. The procedure described by the algorithm waits for the appropriate event to start. Algorithm 1 represents the SLSet procedure, which runs about 20 seconds before Azan time. It initiates the sound level setup at each node. Each node has an array whose size equals the number of its neighbors and which is initially filled with the maximum sound level. By the end of the algorithm, this array is filled with the levels heard from the neighbor nodes. These levels represent the distances of the current node from its neighbors.
In line 4 of the algorithm, _() records the noise level at each of the neighbors' frequencies. When the actual sound level is captured for the neighbors, the logged noise level is subtracted from it to obtain the actual level of the captured sound. The core of this function is the FFT [28], [29]. Lines 7-15 are repeated number-of-levels times. The block starts by firing the tone of this node at its highest level. A delay follows that is long enough to allow tones from neighboring nodes to arrive at the node under consideration. The _() function records around one second of samples for later processing. In lines 10-13, a loop over the recorded samples, taking one window at a time, looks for the tones of the neighbors. In each iteration, the array of levels is updated. In line 17, the array is published to the MQTT broker for the subscribers. In line 18, the messages (tone arrays) that the current node is subscribed to are received from the broker once available (each node publishes its array of tones). Based on the received messages, which represent the levels of the current node's tone as heard by its neighbors, this node calculates the appropriate sound level. We encourage the interested reader to refer to [13] for a complete discussion of this method.

Algorithm 1: Sound-Level-Set (SLSet) procedure, which runs at each node to find its suitable sound level

Input:
Adjacent nodes freq. = { 0 ,

The tone (8305 Hz) is transmitted five times by node ID0. Another tone with a different frequency is transmitted by node ID1. The length of each tone is approximately 140 ms. It is repeated every second over a period of five seconds. Each one-second recording is then analyzed using the FFT algorithm. The analysis proceeds over the recording in chunks of 200 ms. The sound threshold that we used is 20000 (= 20000 × 3.3e-6 V). Note that each node hears its own tone at the highest level (level 1). From these results, we conclude that this procedure can help in setting the sound levels of the nodes under variable weather conditions. Weather conditions (wind direction and speed) weaken or strengthen the sound heard at the receiver. Hence, they give the receiver an impression of the distance to the sound source (near or far away) that is not the actual physical distance. The results are nevertheless correct for this purpose, because the procedure relies on the heard sound to produce the distance from the source.
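The chunked analysis can be sketched as follows. To stay dependency-free, this sketch evaluates a single DFT bin at each tone frequency instead of running a full FFT; the 44.1 kHz sample rate, the synthetic amplitude, the illustrative threshold of 200, and the function names are assumptions, while the 8305 Hz tone, the 140 ms tone length, and the 200 ms chunks come from the text.

```python
import cmath
import math

SAMPLE_RATE_HZ = 44100  # assumed sampling rate
CHUNK_SECONDS = 0.2     # "200 ms" chunks

def tone_magnitude(samples, freq_hz, sample_rate=SAMPLE_RATE_HZ):
    """Amplitude estimate at freq_hz from a single-bin DFT over the samples."""
    step = -2j * math.pi * freq_hz / sample_rate
    acc = sum(x * cmath.exp(step * k) for k, x in enumerate(samples))
    return 2.0 * abs(acc) / len(samples)

def chunks_with_tone(recording, freq_hz, threshold, noise_floor=0.0,
                     sample_rate=SAMPLE_RATE_HZ):
    """Indices of 200 ms chunks whose noise-corrected level exceeds the threshold."""
    size = round(CHUNK_SECONDS * sample_rate)
    hits = []
    for index, start in enumerate(range(0, len(recording) - size + 1, size)):
        chunk = recording[start:start + size]
        level = tone_magnitude(chunk, freq_hz, sample_rate) - noise_floor
        if level > threshold:
            hits.append(index)
    return hits

def synth_burst(freq_hz, start_s, length_s, amplitude, total_s=1.0,
                sample_rate=SAMPLE_RATE_HZ):
    """Synthetic recording: silence with one sine burst (test input only)."""
    out = []
    for k in range(round(total_s * sample_rate)):
        t = k / sample_rate
        on = start_s <= t < start_s + length_s
        out.append(amplitude * math.sin(2.0 * math.pi * freq_hz * t) if on else 0.0)
    return out

# One second of "recording" holding a 140 ms burst of the 8305 Hz tone.
recording = synth_burst(8305.0, start_s=0.45, length_s=0.14, amplitude=1000.0)
heard = chunks_with_tone(recording, 8305.0, threshold=200.0)
```

The `noise_floor` argument mirrors the subtraction of the previously logged noise level; on real hardware it would be measured per frequency before the tones are fired.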

CONCLUSION
In this paper, an IoT-device that captures tones of known frequencies is designed. A novel procedure to approximate the distance between mosques is then proposed and built on this IoT-device. The system was tested in a test structure built for this purpose. Results showed that it is possible to set the sound level of mosque audio amplifiers for the sake of community comfort. For this system to be effective, the audio amplifiers in mosques should support several sound levels. Further, this solution is not immune to interfering sounds that have the same frequencies as the nodes' tones during the sound-level-setup procedure. As future work, we intend to integrate the proposed work with Azan streaming to bring about a complete solution for a unified call to prayer.

Figure 1. Mosques layout in an area, where they are connected through an intranet

Figure 2. Sound levels for each node and their actual locations on the ground: (a) when the wind is calm, (b) when there is wind in the direction indicated by the arrow, and (c) logical distances as seen by each node