Sustainable governance in smart cities and use of supervised learning based opinion mining

Received May 7, 2020. Revised Jun 30, 2020. Accepted Aug 4, 2020.

Evaluation is an analytical and organized process to figure out the present positive influences, favourable future prospects, existing shortcomings and ulterior complexities of any plan, program, practice or policy. Policy evaluation is an essential process required to measure the performance or progress of a scheme. The main purpose of policy evaluation is to empower various stakeholders and enhance their socio-economic environment. Governments launch a large number of policies or schemes in different areas for citizen welfare. Although governmental policies are intended to improve people's quality of life, they may also affect their everyday lives. A recent governmental scheme, Saubhagya, launched by the Indian government in 2017, has been selected for evaluation by applying opinion mining techniques. The data set of public opinion associated with this scheme has been captured from Twitter. The primary intent is to offer opinion mining as a smart city technology that harnesses user-generated big data and analyses it to offer a sustainable governance model.


INTRODUCTION
Policy is a set of laws and principles, a collection of rules and regulations, and a code of conduct laid down by an authentic governing body. Policies are designed to enrich the decision-making process, resulting in favourable outcomes for the betterment of a society or community. The progress curve of a policy represents the gap between actual and expected end results, which can be determined by policy evaluation. Indeed, the evaluation of a policy is a critical process and reflects the public response, which may vary from person to person and across communities or societies. Therefore, there is a pressing need to incorporate individual opinions [1] into the evaluation process, which guides the government in finalizing future actions for a policy.
Moreover, the smart city-smart nation initiatives launched in recent years by various government agencies across the globe aim at designing smart city technologies (SCTs) which enhance the cities' smartness and improve sustainability. The social web [2] plays a vital role in supporting social interactions among people. It facilitates public communication about any governmental policy, so that feedback can be gathered through various social networking sites and channels. Processing the huge amount of data generated over such media helps to better inform the decision-making process. The processing is accomplished with the aid of opinion mining, which extracts public opinion from this big data in support of good policy making. One of the recent governmental schemes, "Pradhan Mantri Sahaj Bijli Har Ghar Yojana", or Saubhagya [3], has been considered in this work to incorporate opinion mining techniques into the process of policy evaluation and thereby improve it. The scheme was launched ("Saubhagya scheme: All you need to know", 2017 [3]) for the electrification of households in rural and urban areas of India by December 2018 [4].
This paper uses one of the well-known social networking sites, Twitter [5], as the source for dataset collection for the following reasons:
- Association with a large volume and variety of people
- Collection of numerous diversified topics
- Providing a great opportunity for researchers to exploit their interests
- Placing and sharing of thoughts at a large scale to reach a common conclusion on a specific subject
- Most widely used tool by government to approach people
- Acts as an early warning system for governmental bodies due to its quick nature
- Only forum on which opinion or feedback on governmental exercises is shared worldwide

Data has been collected over a period of two months after the implementation of the scheme and is further divided into 4 phases as listed in Table 1. This division of the data collection period is intended to provide approximate coverage of public perception regarding the scheme [4]. This work proposes an opinion mining based sustainable governance model, SmartElectricGov, as a smart city technology (SCT) model for the smart city-smart nation initiative. It demonstrates the smart city notion of engaging more effectively and actively with citizens for government policy tracking, primarily its evaluation. It also explains the concept of opinion mining and presents how opinion mining has been introduced to optimize the policy evaluation process. Section 3 evaluates the Saubhagya scheme (a government initiative to provide electricity to every house within the nation) using opinion mining techniques, comprising data acquisition, analytical and conceptual representation of the data, and application of the opinion mining process to determine the polarity of public opinion expressed in tweets. Section 4 empirically analyses the results and findings of the different machine learning algorithms in terms of efficacy measures, along with their graphical representations. Lastly, Section 5 concludes the paper.

SMART CITIES, POLICY EVALUATION AND OPINION MINING
Government agencies are taking steps so that smart city initiatives benefit everyone: residents, business persons and the government. A policy is a powerful tool configured to address social challenges and to meet the basic needs of the common man. It helps a government or organization improve decision making and resource optimization ("Policy", 2018) [4]. Policies are devised to assure a favourable outcome which lifts the socio-economic level of a community or society. If government policies are favourable, many local and global companies will invest in future smart city projects. Policy making is a never-ending effort to translate the requirements of people into a governmental course of action. Figure 1 represents an abstract and generic view of the policy life cycle. The cycle divides the policy process into a sequence of phases: policy planning, policy analysis & development, policy monitoring and policy evaluation.
a. Policy Planning: Being the first phase of the cycle, its main focus is to identify the problem. Problems may arise either from existing complexities in the system or from the need to achieve future objectives for the welfare of society. Controversial issues or disputed points which demand immediate attention from the government or organization, and push it to provide a feasible solution, are recognized and recorded for the agenda. The nature of each problem is then described in order to gain a deeper and better understanding of the situation.
b. Policy Analysis & Development: An organized and complete interpretation of the identified problem is another critical step in the policy life cycle model. It is the process of decomposing entangled issues into smaller, atomic components in order to better understand the correlations between them, which in turn aids problem solving and decision making. Technocracy, bureaucracy and democracy are the three essential parameters considered while analysing a policy. The analysis assesses the political and technical complications and provides valuable information to decision makers or policy analysts about the augmentation of a policy and its effect on socio-economic, political, technical and legal factors [6][7][8]. Policy analysis is followed by policy development, which includes formulation of the policy and its plan of implementation. This requires effective writing skills in order to draft the policy, communicate it to different stakeholders and publish it on a universal forum.
c. Policy Monitoring: The prime focus of this phase is the continuous assessment of the implementation practices carried out for policy development [9]. It helps to recognise potential gaps in the execution of policies and lays out the blueprint for their improvement. In-house strategies and procedures are developed to ensure tracking and reporting of policy progress. Monitoring should be periodic, that is, performed at regular intervals of time.
d. Policy Evaluation: Policy evaluation is one of the crucial phases in the life span of a policy. It analyses various internal and external factors which measure the policy performance based on people administration in all the potential aspects. It helps in determining whether the policy outcome meets the requisite expectations and serves the purpose for which it was intended [10][11][12]. Both qualitative and quantitative measures are taken into consideration for the resource optimization of technical, social, financial, ethical and other aspects, with a view to providing an overall rating of the success or failure of a policy.

Opinion mining
The term opinion mining is composed of two terms, opinion and mining, and implies the mining of opinion. Thus, opinion mining [13][14][15] denotes the extraction of public sentiment on various subject matters across diversified domains. It is a process of capturing and categorizing public perception about a particular product, project, phenomenon or proposal. It plays a vital role in ensuring citizen participation (a) in the progress of public activities which directly or indirectly impact their quality of life and (b) in providing feedback on any product, topic or event in order to improve the future course of action. Opinion mining has three dimensions: tasks, techniques and applications [1], [13]-[16]. These dimensions are interrelated and can be used in any combination to achieve opinion mining. For example, any of the available techniques {machine learning (ML), lexicon based (LB), hybrid techniques (HT), ontology based (OB), context based (CB)} could be used to explore opinions or sentiments by performing various tasks (subjectivity classification, sentiment classification, review measures, spam detection, lexicon formation, feature selection and so on) for any available or probable application area (Government, Market, Business, Smart Society Services, Information Security & Analysis, Sub Component Technology etc.). To accomplish the goal of this paper, two dimensions, namely techniques and applications of opinion mining, have been coupled as indexed in Table 2. The paper uses machine learning based approaches for opinion mining in the implementation sphere of governmental practices, specializing in supervised machine learning techniques for policy evaluation [17][18][19][20].

Smart cities, policy evaluation and opinion mining
Cities in a country like India are governed by multiple organizations and authorities. The different spatial entities with multiple boundaries deter effective planning and governance. Thus, to realize the smart city-smart nation mission, technology which entails and captures effective citizen participation is desired. "People-centric" strategic technology components are imperative to create smart outcomes for citizens. Policy evaluation is an effectual and empirical investigation process to appraise the performance of a policy by measuring its work practices in a real-time environment. This process compares the progress curve of a policy over time by answering various questions, such as examining the accomplishment of goals, checking quality, deciding future prospects and agenda measures, and comparing the implemented strategy with its expected output. It is a complex process in view of its association with numerous real-time constraints such as cost, time, effort, social and legal considerations, resource optimisation and much more. A policy may be evaluated in both internal and external environments. Internal evaluation includes activities such as executing pre-set procedures and methodologies for estimating progress, forming a team of well trained and professionally qualified public officials, and providing appropriate logistics (policy objectives, completion timeline, requisite resources etc.) to evaluators, whereas external evaluation requires the cooperation and participation of the target audience in this critical exercise of rating the pace of the policy. Thus, consulting public opinion proves to be an essential and fruitful practice in policy evaluation, which motivates introducing opinion mining into policy evaluation for the comprehensive development of a country that attracts people and investments.

SMART ELECTRIGOV: SAUBHAGYA YOJNA EVALUATION USING OPINION MINING
Saubhagya, or Pradhan Mantri Sahaj Bijli Har Ghar Yojana, is a socio-centric service launched on September 25, 2017 ("Saubhagya scheme: All you need to know", 2017 [3, 4, 12]) to make the dream of "electricity for all" a reality. This government scheme aims at nationwide household electrification for the welfare of people by improving their quality of life. The government envisions the scheme as extremely beneficial to citizens, considering its positive implications for sectors like education, health, agriculture and entrepreneurship.
Undoubtedly, Saubhagya is a stepping stone to prominent future opportunities for India. Nevertheless, evaluation of this policy will reflect the actual statistics of public reaction and inclination towards the scheme. Hence, this paper introduces opinion mining techniques into policy evaluation with a view to optimising the process, exemplified over Saubhagya. The entire system schema is split into four modules as represented in Table 3. The modules are executed in the following order: data acquisition, pre-processing of data, classification by machine learning techniques and evaluation by standard efficacy measures, which are discussed in the subsequent sections.

Acquisition of data and its pre-processing
The data has been collected from the social networking site Twitter. It offers a platform on which users worldwide, with different cultural, educational and social backgrounds, share views on a particular subject matter. To that end, it provides a global outlook on public sentiment and inclination towards any matter of concern, topic, event or phenomenon [21][22][23][24]. The Twitter search API (application programming interface) has been used to extract Saubhagya scheme tweets. The steps for data acquisition are as follows:
- An application using the gem "twitter" is developed in Ruby on Rails. Tweets are extracted from the Twitter search API through a Ruby interface called the twitter ruby gem.
- The developed application is registered, and the access token, i.e. the OAuth credentials, is received for code integration.
- Some of the hashtags used in this process are: #SaubhagyaScheme, #SaubhagyaYojna, #Saubhagya, #SaubhagyaPlan.
- Finally, the messages containing the particular keywords, as returned by the Twitter APIs, are collected.
- The total count of tweets collected during the four weeks after the launch of the scheme is 1262, recorded along with their daily status count.

Pre-processing of the gathered tweets is performed to obtain clean, consistent and classified data. Different machine learning techniques are then used to classify the normalized tweets. The steps involved in pre-processing are:
- Duplicate tweets are highly likely in the course of data collection, so the first step is to remove them.
- Tweets contain many special characters ($, #, * etc.) and numbers, so numbers and special characters are discarded.
- All URL links and words such as is, am, are, the etc. are removed.
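The pre-processing steps listed above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in the paper (which was built in Ruby on Rails): the function names, the regular expressions, the lower-casing step and the four-word stop list are assumptions made for the example.

```python
import re

# Hypothetical stop-word list: the paper only names "is, am, are, the etc."
STOP_WORDS = {"is", "am", "are", "the"}

URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
NON_LETTER_PATTERN = re.compile(r"[^a-z\s]")


def normalize_tweet(text):
    """Lower-case a tweet, then strip URLs, digits, special characters and stop words."""
    text = text.lower()                        # lower-casing is an assumption of this sketch
    text = URL_PATTERN.sub(" ", text)          # URLs go first, before punctuation is stripped
    text = NON_LETTER_PATTERN.sub(" ", text)   # drops digits and characters such as $, #, *
    tokens = [tok for tok in text.split() if tok not in STOP_WORDS]
    return " ".join(tokens)


def preprocess(tweets):
    """Remove exact duplicate tweets (keeping first occurrences), then normalize each one."""
    seen = set()
    cleaned = []
    for tweet in tweets:
        if tweet in seen:
            continue
        seen.add(tweet)
        cleaned.append(normalize_tweet(tweet))
    return cleaned
```

Duplicates are detected on the raw text here; a stricter variant could compare the normalized text instead, which would also catch retweets that differ only in a link or a hashtag.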

Opinion classification based on machine learning techniques
In this paper, we empirically analyse the standard machine learning algorithms listed in Table 4, using the efficacy measures precision, recall and accuracy [13] listed in Table 5.

Table 4. Machine learning techniques [13]
Name of Technique(s) | Description
Naive Bayesian | Part of a class of simple probabilistic classifiers used for the estimation of classification parameters.
Support Vector Machine | Belongs to the class of discriminative supervised learning based classifiers for identifying classification patterns. It classifies the data by a hyperplane.
Multilayer Perceptron | Represents a network of neurons, called perceptrons, and is a form of artificial neural network.
k-Nearest Neighbours | Fundamental machine learning algorithm that does not make any underlying assumptions about the data distribution. Classification of objects is based on the voting of neighbours, and the class assigned to an object is usually the most common one among its k nearest neighbours [21].
Decision Tree | Tree model of decisions and their results, used as a decision support tool. Developed from top to bottom, with a single root node at the top branching down to several leaf nodes with probable outcomes.
Random Forest | Ensemble of decision trees, an advancement over a single decision tree to get more specific results.
Linear Regression | Linear approach used for predictive analysis to identify whether predictor variables are able to predict dependent variables and which variables are significant predictors of dependent variables.
K-star | Instance based classifier that determines similarity using an entropy based distance function.
Bagging | Improves classification by combining classifiers trained on randomly generated training sets.
Adaboost | Adaptive boosting, a machine learning meta-algorithm. A strong classifier is built by combining several weak classifiers to improve performance.
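To make the first technique of Table 4 concrete, the following is a minimal sketch of a multinomial Naive Bayes classifier over whitespace-separated tokens, written in plain Python. It is not the classifier configuration used in the experiments: the class name, the add-one (Laplace) smoothing and the six training sentences are assumptions chosen purely for illustration.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesSketch:
    """Multinomial Naive Bayes over whitespace tokens with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)       # number of training tweets per class
        self.word_counts = defaultdict(Counter)   # per-class token frequencies
        self.vocabulary = set()
        for text, label in zip(texts, labels):
            tokens = text.split()
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        self.total_docs = len(labels)
        return self

    def predict(self, text):
        tokens = text.split()
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Start from the log prior, then add the log likelihood of each token.
            score = math.log(doc_count / self.total_docs)
            total_words = sum(self.word_counts[label].values())
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical, hand-written training tweets (not taken from the collected data set).
train_texts = [
    "great scheme electricity for all",
    "good initiative for villages",
    "no power yet bad delay",
    "poor implementation bad",
    "scheme launched today",
    "ministry announced scheme details",
]
train_labels = ["positive", "positive", "negative", "negative", "neutral", "neutral"]

model = NaiveBayesSketch().fit(train_texts, train_labels)
```

Log probabilities are summed rather than multiplying raw probabilities, which avoids numerical underflow on longer tweets; smoothing keeps unseen words from forcing a class probability to zero.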

Policy evaluation based on opinion mining
Sentiment identification and classification in the collected tweets is done by applying opinion mining techniques [14]. Incorporating the thoughts or viewpoints of stakeholders and end users in policy evaluation yields a more precise portrayal of policy performance. As a step towards this, the opinion associated with the collected tweets on the Saubhagya scheme has been mined and categorized into three classes (positive, negative and neutral); see Table 6. The status of positive, negative and neutral tweets on Saubhagya Yojna across phases is depicted in Figure 2. The figure shows a clear positive inclination of people towards the scheme, with 90% positive and 10% neutral tweets during phase one. Not a single negative tweet was found among the 281 tweets collected in this phase. The records reflect a more mixed reaction from citizens in the second phase: 63% positive, 9.6% negative and 26.6% neutral tweets, with positive tweets still the largest group. Phase three shows a pattern similar to the earlier phase, with the highest share, 78%, of positive tweets, followed by 16% neutral and 6% negative tweets. An unexpected rise in neutral tweets, to 71%, is observed in phase four, against 17% positive and 12% negative tweets. The abrupt shift between neutral and positive tweets arises because public participation in expressing thoughts was at its maximum in the initial phases after the launch of the scheme, whereas a large number of informational tweets were posted by media or civic sources in the last phase concerning the most recent measures finalized for policy execution.
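The per-phase percentages discussed above are simple class shares. A minimal sketch of that tally is given below; the function names and the (phase, polarity) input layout are assumptions for illustration, and any labels passed in are expected to come from the classification step rather than from this code.

```python
from collections import Counter

POLARITIES = ("positive", "negative", "neutral")


def polarity_distribution(labels):
    """Return the percentage share of each polarity class, rounded to one decimal."""
    counts = Counter(labels)
    total = len(labels)
    if total == 0:
        return {p: 0.0 for p in POLARITIES}
    return {p: round(100.0 * counts[p] / total, 1) for p in POLARITIES}


def distribution_by_phase(labelled_tweets):
    """Group (phase, polarity) pairs by phase and summarize each group."""
    by_phase = {}
    for phase, polarity in labelled_tweets:
        by_phase.setdefault(phase, []).append(polarity)
    return {phase: polarity_distribution(labels) for phase, labels in sorted(by_phase.items())}
```

For instance, a toy phase with nine positive labels and one neutral label yields shares of 90.0, 0.0 and 10.0 for positive, negative and neutral respectively.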

RESULTS AND FINDINGS
On the 1262 tweets collected from Twitter, a set of ten supervised machine learning algorithms has been applied to find the polarity of opinion on this scheme [25]. SVM, kNN, MLP, RF, LR, DT, K-star, Bagging, NB and Adaboost are the approaches used to calculate the standard evaluation measures, and the respective experimental results for accuracy, precision and recall are indexed in Table 7. A comparative analysis of the supervised machine learning algorithms has been carried out to determine the best classifier for categorizing sentiments as direct quantifiers of policy. The following observations are drawn from these results:
- The highest accuracy, 91.77%, is achieved by Support Vector Machine, and the lowest, 72.41%, by Adaboost.
- kNN and MLP show encouraging results, both with 91.47% accuracy, but kNN is better in terms of precision, with 91.6% against 91% for MLP.
- The next level is achieved by Random Forest with an accuracy of 91.07%, followed by Linear Regression with 90.87%.
- The accuracy of Decision Tree is 90.67%, followed by K-star with 89.97%.
- Bagging records an accuracy of 88.36%, followed by Naive Bayes with 77.73%.

Among all, SVM leads on all three measures: accuracy, precision and recall. The relative performance of the above machine learning algorithms on the efficacy measures is depicted in Figure 3.
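For reference, the three efficacy measures compared above can be computed from true and predicted labels as in the sketch below. The paper does not state how precision and recall are aggregated over the three polarity classes, so macro-averaging is an assumption here, as are the function names.

```python
def accuracy(y_true, y_pred):
    """Fraction of tweets whose predicted polarity matches the true polarity."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def precision_recall(y_true, y_pred, label):
    """Precision and recall for one class, treating it as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def macro_average(y_true, y_pred):
    """Unweighted mean of per-class precision and recall (assumed aggregation)."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = [precision_recall(y_true, y_pred, lab) for lab in labels]
    macro_p = sum(p for p, _ in pairs) / len(labels)
    macro_r = sum(r for _, r in pairs) / len(labels)
    return macro_p, macro_r
```

Macro-averaging weights every class equally, so a classifier that ignores a small class (for example, negative tweets) is penalized even when its overall accuracy stays high.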

CONCLUSION
Since the common man, or the target audience, is the beholder of a policy, considering their ideas or views in the evaluation process is essential. The people's perspective helps policy makers decide the future prospects and the corrective and preventive measures recommended for a policy in order to make it a success. Therefore, the aim of this paper was to introduce opinion mining for policy evaluation as a smart city technology solution for sustainable governance. To exemplify the model, a recent governmental scheme, Pradhan Mantri Sahaj Bijli Har Ghar Yojana or Saubhagya (household electricity service), is considered. Government intelligence leverages opinion mining to include citizen input and enhance the policy and decision-making cycle. It helps governments and organizations in resource management, strengthens political measures and mechanisms, and revolutionizes the delivery of basic services to establish a people-focused model of governance. The scope of government information includes strategy, consultative participation, lobbying etc.
The scheme is evaluated by analyzing public sentiments extracted from tweets by applying opinion mining techniques. A set of supervised machine learning techniques, namely support vector machine, k-nearest neighbour, multilayer perceptron, random forest, linear regression, decision tree, K-star, Bagging, naive Bayes and Adaboost, has been applied and compared on the basis of accuracy, precision and recall. The following conclusions have been drawn from this research work: (a) 63% of the tweets are positive, which signifies a favourable response of people towards this policy; (b) 30% of the tweets are neutral, which includes many informational tweets posted by civic or media sources regarding the latest updates of the scheme or its implementation process; (c) 7% of the tweets are negative; (d) SVM outperforms all the other implemented supervised machine learning techniques with an accuracy of 91.77%; (e) kNN and MLP share the next level with an accuracy of 91.47%, followed by RF, LR, DT, K-star, Bagging, NB and Adaboost; (f) Adaboost is the lowest with an accuracy of 72.41%. The results are promising and validate the use of opinion mining as a smart technology solution for government policy evaluation. There is considerable scope for evaluating more policies to build a smart city tool. Moreover, the process of evaluation can be enhanced by improving the task of sentiment classification using soft computing and evolutionary techniques.

BIOGRAPHIES OF AUTHORS
Hena Iqbal completed her doctorate degree (Ph.D.) in India in 2015 and has been teaching in the UAE for many years. She has around 10 years of academic teaching experience. As a teacher, her main goal is to motivate students to do their best and extend their own personal limits. She devises programs, according to the syllabus requirements, that expand on previous knowledge and encourage students to explore new and interesting possibilities. She has organized various workshops and seminars and has given many guest lectures. On each occasion she has demonstrated excellent planning, communication and teamwork skills. She is very active in her research work. She has more than 10 research publications in refereed journals and international conferences. Her research areas include software engineering, mobile applications and cyber security. At present she is working as an Assistant Professor in the IT department at Al Dar University College, Dubai, UAE.

Sujni Paul is a faculty member in the Department of Computer Information Science at the Higher Colleges of Technology. She has around 16 years of academic teaching experience. She completed her Ph.D. in 2009. Her research areas include parallel and distributed data mining, web services and technologies, big data and cloud computing. She has more than 45 research publications in refereed journals and international conferences. At present she is supervising 3 Ph.D. research scholars, and one Ph.D. student has graduated. She has organized various workshops and conferences and has given many guest lectures. She has contributed to curriculum design and has been a member of the board of studies in various reputed organizations. She is the author of a chapter in a book titled Distributed Data Mining. She is very keen on community service and has supervised many students in service related projects. She is an editorial board member and reviewer for different journals.

Dr. Khaliquzzaman Khan received his Ph.D. in Statistics from Agra University, Agra, India in 2004. He received his MA in Statistics and BA (Honors) in Economics from Aligarh Muslim University, Aligarh, India in 1975 and 1978 respectively. His research interests cover management, especially industrial management, and quantitative finance, with specific interest in security and portfolio analyses. At present he is working as an Assistant Professor at Al Dar University College, Dubai.