Towards automatic setup of non intrusive appliance load monitoring – feature extraction and clustering



INTRODUCTION
Soaring energy prices and growing concerns about global warming have prompted governments and policy makers to promote efficient energy systems that can save considerable amounts of energy and, as a result, reduce carbon emissions (e.g., [1], [2]). In 2007, the European Council set the following ambitious targets to be achieved by 2020: reducing greenhouse gas emissions by 20% with respect to 1990 levels, improving energy efficiency with the aim of saving 20% of the European Union's energy consumption, and raising the share of renewable energy sources to 20% [3]. Load monitoring, as part of an energy management system, can substantially reduce the energy consumption of a household [4]. Indeed, a detailed review of more than 60 feedback studies suggests that considerable energy savings can be achieved using real-time appliance-level consumption information as opposed to monthly bills or weekly advice on energy consumption [5]. Load monitoring can also provide critical information for the demand response programs offered by utility companies, improving load forecasting accuracy and providing a convenient platform for utility companies to implement real-time pricing [6], [7].
One method of load monitoring is distributed direct sensing, which requires a sensor at each device or appliance in order to measure its consumption. Although conceptually straightforward and potentially highly accurate, direct sensing is often expensive because of time-consuming installation and the need for one sensor per device or appliance. In response to the limitations of direct sensing, researchers have explored methods to infer disaggregated energy usage via a single sensor. The pioneering work in this area is non-intrusive appliance load monitoring (NIALM), first introduced by George Hart in the late 1980s [8]. In contrast to direct sensing methods, NIALM relies solely on single-point measurements of voltage and current on the power feed entering the household. NIALM consists of four steps: data acquisition, event detection, feature extraction, and event classification. The raw current and voltage waveforms are transformed into a feature vector, i.e., a more compact and meaningful representation that may include real power, reactive power, current-voltage phase difference, and harmonics (e.g., [9], [10]). These extracted features are monitored for changes, which are identified as events (e.g., an appliance turning "on" or "off") and classified down to the appliance or device category level using a classification algorithm, which usually compares the features to a preexisting database of signatures. Several reviews and comparative studies of feature extraction and classification methods for electric loads in residential and commercial buildings can be found in the literature [11][12][13][14].
Journal Homepage: http://www.iaescore.com/journals/index.php/IJECE (IJECE, ISSN: 2088-8708)
Depending on the degree of non-intrusiveness, the literature differentiates between manual-setup NIALM (MS-NIALM) and automatic-setup NIALM (AS-NIALM) systems. The manual setup is a non-intrusive appliance behavior tracker that requires a one-time intrusive period for setup. During the intrusive setup period, individual appliances are manually switched on and off to learn their signatures. On the other hand, the automatic setup is a process that sets itself up using prior information about potential appliances. AS-NIALM hence extracts the signatures and labels them without any sort of manual intervention, which would greatly facilitate mass installation of smart meters. To the author's knowledge, no AS-NIALM system has hitherto been implemented. It is hence the main goal of this work to pave the way for such a solution.
In this paper, the first two modules of an AS-NIALM system are presented. The system consists of the following four modules: feature extraction, clustering, labeling and classification. The feature extraction module applies the ESPRIT method, a well-known subspace-based estimation technique, to the drawn electric current. The result is a compact representation of the current in terms of complex numbers referred to as poles and residues [15], [16]. These complex numbers are shown to be characteristic of the considered load and thus can serve as features for the subsequent classification layer. In [17], [18], poles and residues of current signals were estimated using the matrix pencil method. For both synthetic and real data, results indicate that poles and residues extracted by ESPRIT allow an almost perfect reconstruction of drawn electric currents. Once a signature is extracted, the clustering module applies distance-based rules inferred offline from various databases and decides either to create a new class from the new signature or to discard it and increase the count of an existing signature. Signatures can hence be analyzed instantly and need not be stored in memory. As a result, singleton clusters are formed without a priori knowledge of the number of appliances. Results obtained from a household database indicate that these two modules succeed in distinguishing signatures of different appliances.
The objectives of this paper are summarized in the following three points:
1. show that the reduced number of poles and residues estimated by ESPRIT enables an accurate reconstruction of synthetic and real signals;
2. show that the fundamental and higher harmonic currents determined from poles and residues yield a feature space with reduced inter-cluster overlap;
3. propose a rule-based technique that assigns a set of signatures to singleton clusters.
The rest of the paper is organized as follows. Section 2 presents the signal model, the covariance matrix, the principle of ESPRIT and the validation on simulated and real data. In Section 3, the feature space is given. Section 4 explains the rule-based clustering approach and evaluates its performance on several load combinations. Finally, Section 5 provides the summary and conclusion.

MODEL-BASED FEATURE EXTRACTION

2.1. Signal Model
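For a sinusoidal driving voltage of the form $v(t) = V\sqrt{2}\sin(\omega t)$, the drawn electric current can be modeled as a linear combination of $d$ cisoids (complex-valued sinusoidal signals) weighted by complex residues according to the following damped exponential signal model:
$$i(t) = \sum_{m=1}^{d} r_m\, e^{(-\alpha_m + j 2\pi f_m)\,t} + b(t) \tag{1}$$
where $r_m$ is the residue of the $m$th cisoid, $\alpha_m$ is its attenuation factor, $f_m$ is its frequency, and $b(t)$ is additive white Gaussian noise with zero mean and variance $2\sigma^2$. After sampling, the time variable $t$ is replaced by $t_k = k t_s$, where $t_s = 6.25 \times 10^{-4}$ s is the chosen sampling period. The discrete current signal becomes:
$$i(k) = \sum_{m=1}^{d} r_m\, z_m^{k} + b(k) \tag{2}$$
where
$$z_m = e^{(-\alpha_m + j 2\pi f_m)\,t_s} \tag{3}$$
is the $m$th complex pole. In matrix form, the signal model is expressed by:
$$\mathbf{i} = \mathbf{A}\mathbf{r} + \mathbf{b} \tag{4}$$
with the following notational definitions:
$$\mathbf{i} = [\,i(0)\;\; i(1)\;\; \cdots\;\; i(N-1)\,]^T$$
$$\mathbf{A} = [\,\mathbf{a}(z_1)\;\; \mathbf{a}(z_2)\;\; \cdots\;\; \mathbf{a}(z_d)\,], \qquad \mathbf{a}(z) = [\,1\;\; z\;\; \cdots\;\; z^{N-1}\,]^T$$
$$\mathbf{r} = [\,r_1\;\; r_2\;\; \cdots\;\; r_d\,]^T$$
$$\mathbf{b} = [\,b(0)\;\; b(1)\;\; \cdots\;\; b(N-1)\,]^T$$
The superscript $T$ denotes the transpose operator. The feature extraction problem can now be stated as follows. Given the electric current data sequence $\{i(k)\}$, $k = 0, \ldots, N-1$, estimate the $d$ poles $z_m$ and their residues $r_m$.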
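The model above can be illustrated with a short NumPy sketch that builds the Vandermonde mode matrix and synthesizes a noise-free current from a chosen set of poles and residues. The function names `mode_matrix` and `synthesize` are hypothetical, and the sign convention (a positive attenuation factor gives a decaying cisoid) is an assumption of this sketch rather than something confirmed by the text.

```python
import numpy as np

TS = 6.25e-4  # sampling period quoted in the text, in seconds

def mode_matrix(poles, n_samples):
    """Vandermonde matrix (n_samples x d); column m holds z_m**k for k = 0..n_samples-1."""
    k = np.arange(n_samples)[:, np.newaxis]
    return np.asarray(poles, dtype=complex)[np.newaxis, :] ** k

def synthesize(residues, alphas, freqs, n_samples, ts=TS):
    """Noise-free samples i(k) = sum_m r_m * z_m**k.

    Assumed convention: z_m = exp((-alpha_m + j*2*pi*f_m) * ts).
    """
    alphas = np.asarray(alphas, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    poles = np.exp((-alphas + 2j * np.pi * freqs) * ts)
    samples = mode_matrix(poles, n_samples) @ np.asarray(residues, dtype=complex)
    return samples, poles

# A unit-amplitude 50 Hz sine is the sum of two conjugate cisoids
# with residues -j/2 (at +50 Hz) and +j/2 (at -50 Hz).
current, poles = synthesize([-0.5j, 0.5j], [0.0, 0.0], [50.0, -50.0], n_samples=32)
```

With this sampling period one 50 Hz cycle spans 32 samples, so `current.real` traces exactly one period of the sine and its imaginary part vanishes up to rounding.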

The Covariance Matrix
The $N \times N$ covariance matrix of the data vector $\mathbf{i}$ is defined by:
$$\mathbf{R} = E\{\mathbf{i}\,\mathbf{i}^H\}$$
where $E$ and $H$ are the expectation and the conjugate-transpose operators, respectively. Using equation (4) and developing, $\mathbf{R}$ can be written as
$$\mathbf{R} = \mathbf{A}\,E\{\mathbf{r}\mathbf{r}^H\}\,\mathbf{A}^H + 2\sigma^2\mathbf{I} = \mathbf{R}_i + 2\sigma^2\mathbf{I}$$
where $\mathbf{R}_i$ and $2\sigma^2\mathbf{I}$ denote the covariance matrices of the cisoids and noise, respectively. In practice, $\mathbf{R}$ is estimated from $M$ independent realizations or snapshots of the data vector $\mathbf{i}$ and is termed the sample covariance matrix:
$$\hat{\mathbf{R}} = \frac{1}{M}\sum_{\ell=1}^{M} \mathbf{i}_\ell\,\mathbf{i}_\ell^H$$
where $\mathbf{i}_\ell$ denotes the $\ell$th snapshot. The covariance matrix is Hermitian and positive semi-definite. According to the spectral theorem [19], every Hermitian matrix can be diagonalized by a unitary matrix (a matrix is said to be unitary if and only if its inverse is equal to its conjugate transpose), and the resulting diagonal matrix has only real entries. This means that all eigenvalues of $\mathbf{R}$ are real and, due to its positive semi-definiteness, nonnegative. More importantly, eigenvectors with distinct eigenvalues are orthogonal. It is this property of orthogonality that forms the basis of subspace-based estimation methods. Being Hermitian, $\mathbf{R}$ admits the following eigenvalue decomposition (EVD):
$$\mathbf{R} = \mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^H, \qquad \mathbf{U} = [\,\mathbf{U}_i\;\; \mathbf{U}_n\,], \qquad \boldsymbol{\Lambda} = \mathrm{diag}(\lambda_1, \ldots, \lambda_N)$$
where $\mathbf{U}$ is the unitary matrix of eigenvectors, $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_d > \lambda_{d+1} = \cdots = \lambda_N = 2\sigma^2$ are the eigenvalues, and $\mathbf{U}_i = [\,\mathbf{u}_1 \ldots \mathbf{u}_d\,]$ and $\mathbf{U}_n = [\,\mathbf{u}_{d+1} \ldots \mathbf{u}_N\,]$ contain the eigenvectors associated with the signal (largest $d$) and noise (smallest $N - d$) eigenvalues, respectively. The subspace spanned by the columns of $\mathbf{U}_i$ is termed the signal subspace, whereas the subspace spanned by the columns of $\mathbf{U}_n$ is termed the noise subspace.
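As a minimal sketch of this step, the snippet below estimates the sample covariance matrix and splits its eigenvectors into signal and noise subspaces. It assumes that overlapping length-N windows of a single recording can stand in for the M snapshots, which is a common practical shortcut but not something the text specifies; the function names are hypothetical.

```python
import numpy as np

def sample_covariance(x, N):
    """Sample covariance from overlapping length-N windows (assumed snapshot scheme)."""
    x = np.asarray(x, dtype=complex)
    M = len(x) - N + 1                      # number of windows used as snapshots
    X = np.column_stack([x[m:m + N] for m in range(M)])   # N x M data matrix
    return X @ X.conj().T / M

def split_subspaces(R, d):
    """Return eigenvalues in descending order plus signal and noise eigenvectors."""
    eigvals, eigvecs = np.linalg.eigh(R)    # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return eigvals, eigvecs[:, :d], eigvecs[:, d:]

# Noise-free 50 Hz sine: two cisoids, so d = 2.
k = np.arange(64)
x = np.sin(2 * np.pi * 50.0 * k * 6.25e-4)
R = sample_covariance(x, N=8)
eigvals, U_i, U_n = split_subspaces(R, d=2)
```

Because the example is noise-free, the six smallest eigenvalues are zero up to rounding, which corresponds to the case $2\sigma^2 = 0$ in the eigenvalue ordering above.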

Estimation of Signal Parameters via Rotational Invariance Techniques
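The basic idea behind ESPRIT is to exploit the rotational invariance of the underlying signal subspace induced by the Vandermonde structure of the mode matrix $\mathbf{A}$ without having to know $\mathbf{A}$ [20], [21]. In particular, this rotational invariance can be established by partitioning $\mathbf{A}$ into two adjacent frequency sub-bands, preferably with maximum overlap. The reason for maximum sub-band overlap is to maintain the resolution capability of the algorithm. For both the damped and undamped exponential models, it is then easily verified that the following relation holds:
$$\mathbf{A}_2 = \mathbf{A}_1 \boldsymbol{\Phi}$$
where $\mathbf{A}_1$ and $\mathbf{A}_2$ are $(N-1) \times d$ matrices obtained from $\mathbf{A}$ by removing its last and first rows, respectively:
$$\mathbf{A}_1 = [\,\mathbf{I}_{N-1}\;\; \mathbf{0}\,]\,\mathbf{A}, \qquad \mathbf{A}_2 = [\,\mathbf{0}\;\; \mathbf{I}_{N-1}\,]\,\mathbf{A}$$
and $\boldsymbol{\Phi}$ is a $d \times d$ full-rank diagonal matrix that gathers the signal poles:
$$\boldsymbol{\Phi} = \mathrm{diag}(z_1, z_2, \ldots, z_d)$$
Therefore, in contrast to polynomial-rooting-based methods such as root-MUSIC, ESPRIT can be adapted to parameter estimation of the damped exponential model representing the drawn electric current, since the shift invariance structure is maintained. It is worth noting that in the case of the undamped exponential model (unit-modulus cisoids), $z^* = 1/z$ and $\boldsymbol{\Phi}$ becomes a unitary matrix since $\boldsymbol{\Phi}\boldsymbol{\Phi}^H = \boldsymbol{\Phi}^H\boldsymbol{\Phi} = \mathbf{I}$. However, for the damped exponential case, the complex numbers on the diagonal of $\boldsymbol{\Phi}$ have moduli less than one and $\boldsymbol{\Phi}$ becomes a contractive operator. The objective of ESPRIT is hence to estimate the diagonal elements of $\boldsymbol{\Phi}$ from data samples. Since the subspace spanned by the mode vectors of $\mathbf{A}$ is equal to the subspace spanned by the columns of $\mathbf{U}_i$ (the signal subspace) defined in section (2.2.), there must exist a unique, nonsingular $d \times d$ matrix $\mathbf{T}$ such that
$$\mathbf{U}_i = \mathbf{A}\mathbf{T} \tag{17}$$
Partitioning $\mathbf{U}_i$ in the same way as done for $\mathbf{A}$ gives
$$\mathbf{U}_{i1} = [\,\mathbf{I}_{N-1}\;\; \mathbf{0}\,]\,\mathbf{U}_i, \qquad \mathbf{U}_{i2} = [\,\mathbf{0}\;\; \mathbf{I}_{N-1}\,]\,\mathbf{U}_i \tag{18}$$
Using equation (17), $\mathbf{U}_{i1}$ and $\mathbf{U}_{i2}$ can be expressed as
$$\mathbf{U}_{i1} = \mathbf{A}_1\mathbf{T} \tag{19}$$
$$\mathbf{U}_{i2} = \mathbf{A}_2\mathbf{T} = \mathbf{A}_1\boldsymbol{\Phi}\mathbf{T} \tag{20}$$
Equation (19) implies
$$\mathbf{A}_1 = \mathbf{U}_{i1}\mathbf{T}^{-1} \tag{21}$$
Replacing the above equation in equation (20) yields
$$\mathbf{U}_{i2} = \mathbf{U}_{i1}\mathbf{T}^{-1}\boldsymbol{\Phi}\mathbf{T} \tag{22}$$
which can also be expressed as
$$\mathbf{U}_{i2} = \mathbf{U}_{i1}\boldsymbol{\Psi} \tag{23}$$
where
$$\boldsymbol{\Psi} = \mathbf{T}^{-1}\boldsymbol{\Phi}\mathbf{T} \tag{24}$$
Therefore, $\boldsymbol{\Phi}$ and $\boldsymbol{\Psi}$ are similar matrices and the sought diagonal elements of $\boldsymbol{\Phi}$ are the eigenvalues of $\boldsymbol{\Psi}$.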
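The shift-invariance argument can be checked numerically on a small example. The sketch below uses three illustrative poles (a conjugate 50 Hz pair and one real damped pole, chosen for this example only) and an arbitrary nonsingular matrix T. It confirms that removing the first row of the mode matrix is equivalent to right-multiplying the row-trimmed matrix by the diagonal pole matrix, and that the eigenvalues of the similar matrix are the poles.

```python
import numpy as np

N, ts = 6, 6.25e-4
# Illustrative poles: a conjugate 50 Hz pair and one real damped pole.
poles = np.exp(np.array([2j * np.pi * 50.0, -2j * np.pi * 50.0, -100.0]) * ts)

A = poles[np.newaxis, :] ** np.arange(N)[:, np.newaxis]   # N x d Vandermonde matrix
A1, A2 = A[:-1, :], A[1:, :]                              # drop last row / drop first row
Phi = np.diag(poles)
shift_ok = np.allclose(A2, A1 @ Phi)

# Any nonsingular T gives a basis U = A T of the same subspace.
rng = np.random.default_rng(0)
T = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)) + 3.0 * np.eye(3)
U = A @ T
Psi = np.linalg.inv(T) @ Phi @ T
subspace_ok = np.allclose(U[1:, :], U[:-1, :] @ Psi)
recovered = np.linalg.eigvals(Psi)                        # equals the poles, in some order
```

The eigenvalues come back in no particular order, so any comparison with the true poles should match each pole to its nearest estimate.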
The estimation problem then amounts to estimating $\boldsymbol{\Psi}$ from $\mathbf{U}_i$ and computing its eigenvalues. Equation (24) is the key relationship in the development of ESPRIT and its properties. In practice, the covariance matrix is estimated from a finite number of noisy snapshots as described in section (2.2.). The result is that the subspace spanned by the columns of $\hat{\mathbf{U}}_i$ is only an estimate of the signal subspace and differs from the subspace spanned by the mode vectors of $\mathbf{A}$. Moreover, the spans of the columns of $\hat{\mathbf{U}}_{i1}$ and $\hat{\mathbf{U}}_{i2}$ are different. Thus, the objective of finding a matrix $\boldsymbol{\Psi}$ that satisfies equation (23) is no longer achievable.

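A commonly employed criterion for this type of problem is the least-squares (LS) criterion, which yields the following approximate solution for $\boldsymbol{\Psi}$:
$$\hat{\boldsymbol{\Psi}}_{LS} = \left(\hat{\mathbf{U}}_{i1}^H \hat{\mathbf{U}}_{i1}\right)^{-1} \hat{\mathbf{U}}_{i1}^H \hat{\mathbf{U}}_{i2}$$
Knowing that both $\hat{\mathbf{U}}_{i1}$ and $\hat{\mathbf{U}}_{i2}$ contain errors, estimating $\boldsymbol{\Psi}$ using the total least-squares (TLS) criterion is more appropriate. The TLS solution is given by:
$$\hat{\boldsymbol{\Psi}}_{TLS} = -\mathbf{W}_{22}\mathbf{W}_{12}^{-1}$$
where the $d \times d$ matrices $\mathbf{W}_{12}$ and $\mathbf{W}_{22}$ are determined from a matrix $\mathbf{W}$ that results from the singular value decomposition of the $(N-1) \times 2d$ matrix $\mathbf{V} = [\,\hat{\mathbf{U}}_{i2}\;\; \hat{\mathbf{U}}_{i1}\,]$ as follows:
$$\mathbf{V} = \mathbf{Q}\boldsymbol{\Sigma}\mathbf{W}^H, \qquad \mathbf{W} = \begin{bmatrix} \mathbf{W}_{11} & \mathbf{W}_{12} \\ \mathbf{W}_{21} & \mathbf{W}_{22} \end{bmatrix}$$
The last $d$ columns of $\mathbf{W}$ are associated with the $d$ smallest singular values and span the approximate null space of $\mathbf{V}$, so that $\hat{\mathbf{U}}_{i2}\mathbf{W}_{12} + \hat{\mathbf{U}}_{i1}\mathbf{W}_{22} \approx \mathbf{0}$, which leads to the TLS expression above.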
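The full estimation chain can be sketched in NumPy as follows. This is an illustrative implementation under stated assumptions, not the authors' code: snapshots are taken as overlapping windows of one recording, the residues are obtained afterwards by an ordinary least-squares fit of the mode matrix to the data, and the names `esprit` and `mode_matrix` are hypothetical. The TLS branch stacks the two sub-blocks in the order given in the text (second block first) and uses the last d right singular vectors.

```python
import numpy as np

def mode_matrix(poles, n_samples):
    """Vandermonde matrix whose column m holds z_m**k, k = 0..n_samples-1."""
    return np.asarray(poles)[np.newaxis, :] ** np.arange(n_samples)[:, np.newaxis]

def esprit(x, d, N, tls=True):
    """Estimate d poles and residues of the sequence x from length-N windows."""
    x = np.asarray(x, dtype=complex)
    M = len(x) - N + 1
    X = np.column_stack([x[m:m + N] for m in range(M)])   # assumed snapshot scheme
    R = X @ X.conj().T / M                                 # sample covariance
    eigvals, eigvecs = np.linalg.eigh(R)
    U_i = eigvecs[:, np.argsort(eigvals)[::-1][:d]]        # d dominant eigenvectors
    U_i1, U_i2 = U_i[:-1, :], U_i[1:, :]                   # drop last / first row
    if tls:
        # SVD of [U_i2 U_i1]; numpy returns W^H, so take the conjugate transpose.
        _, _, Wh = np.linalg.svd(np.hstack([U_i2, U_i1]))
        W = Wh.conj().T
        W12, W22 = W[:d, d:], W[d:, d:]
        Psi = -W22 @ np.linalg.inv(W12)
    else:
        Psi = np.linalg.lstsq(U_i1, U_i2, rcond=None)[0]   # least-squares solution
    poles = np.linalg.eigvals(Psi)
    residues = np.linalg.lstsq(mode_matrix(poles, len(x)), x, rcond=None)[0]
    return poles, residues

# Noise-free test signal: 50 Hz sine plus a decaying exponential (d = 3).
ts = 6.25e-4
k = np.arange(160)
true_poles = np.exp(np.array([2j * np.pi * 50.0, -2j * np.pi * 50.0, -100.0]) * ts)
x = np.sin(2 * np.pi * 50.0 * k * ts) + 0.5 * np.exp(-100.0 * k * ts)
poles_tls, res_tls = esprit(x, d=3, N=24, tls=True)
poles_ls, res_ls = esprit(x, d=3, N=24, tls=False)
```

On noise-free data both variants return the same poles up to rounding; the two criteria only differ once the subspace estimate is perturbed by noise.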

Linear Loads
To validate ESPRIT as a feature extraction method, its poles and residues were first compared with those obtained from the theoretical expressions of the following linear elementary loads: series RC, series RL, parallel RL and series RLC. The RC and RL circuits lead to first-order differential equations in time, whereas the RLC circuit leads to a second-order differential equation. Using Euler's formula and rearranging, the current expression obtained from the solution of the differential equation characterizing the load can be rewritten in the form of equation (1). The poles and residues of each elementary load can then be readily identified. Table 1 gives the residues, attenuation factors, frequencies, and dependent parameters of the four studied elementary loads. As can be seen from this table, first-order circuits (RL and RC) are characterized by two purely imaginary conjugate poles representing their forced response and one real pole representing their natural response, whereas the second-order circuit (RLC) has, besides the two purely imaginary conjugate poles of its forced response, two complex conjugate poles related to its natural response.
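As a worked example of this identification, the sketch below writes the current of a series RL circuit with zero initial current as a sum of three cisoids and checks it against the circuit equation $L\,di/dt + Ri = v$. The expressions are the standard textbook solution, not values copied from Table 1; the RMS voltage of 230 V is an assumption, while R, L and the sampling period are those quoted in the Results section.

```python
import numpy as np

R, L = 10.0, 0.1            # series RL values quoted in the Results section
V, f0 = 230.0, 50.0         # assumed RMS voltage and supply frequency
w = 2 * np.pi * f0

I_m = V * np.sqrt(2) / np.hypot(R, w * L)   # peak of the forced response
phi = np.arctan2(w * L, R)                  # phase lag of the forced response

# i(t) = I_m*sin(w*t - phi) + I_m*sin(phi)*exp(-(R/L)*t), written as three cisoids.
s = np.array([1j * w, -1j * w, -R / L])     # two imaginary exponents, one real
r = np.array([I_m * np.exp(-1j * phi) / 2j,
              -I_m * np.exp(1j * phi) / 2j,
              I_m * np.sin(phi)])

t = np.arange(320) * 6.25e-4                # ten periods at the paper's sampling period
E = np.exp(np.outer(t, s))
i_t = E @ r                                 # current from residues and exponents
di_dt = E @ (r * s)                         # exact derivative of the cisoid sum
v_t = V * np.sqrt(2) * np.sin(w * t)
residual = L * di_dt + R * i_t - v_t        # should vanish if the model solves the ODE
```

The two conjugate residues carry the forced response and the real residue `I_m*sin(phi)` carries the natural response, whose attenuation factor is R/L = 100 per second for these values.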
The dependent parameters, $A$ and $B$, of the RLC circuit are expressed in terms of $A_1$ and $A_2$ given by:

Nonlinear Loads
A nonlinear load is one for which the relationship between the current through the load and the voltage across the load is a nonlinear function. A simple view of the nature of nonlinear loads can be presented using Ohm's Law, which states that the voltage is the product of the load resistance and the current (V = RI). For a linear load, the resistance (R) is a constant; for a nonlinear load, the resistance varies. When AC power is supplied to a nonlinear load, the result is the creation of currents that do not oscillate at the supply frequency. These currents are called harmonics.
Harmonics occur at integer multiples of the supply (fundamental) frequency. For instance, if the fundamental frequency is 50 Hz, the so-called second harmonic is at 100 Hz, the third harmonic at 150 Hz, and so on. Any number of harmonics can be created by a particular piece of equipment depending on that equipment's electrical characteristics. Therefore, the current drawn by nonlinear loads can still be represented by equation (1), where harmonics appear in the form of pole-residue couples at frequencies that are multiples of 50 Hz.

Results
Assuming zero initial conditions ($i_{L0} = 0$ and/or $v_{C0} = 0$), the following numerical values were used to determine the electric current data sequence from which ESPRIT extracted poles and residues: {R = 100 Ω, C = 0.1 mF} for the series RC circuit, {R = 10 Ω, L = 100 mH} for both the series and parallel RL circuits, and {R = 1 Ω, L = 20 mH, C = 60 mF} for the series RLC circuit. A duration of ten periods or 0.2 s was chosen for the current which, at $t_s = 6.25 \times 10^{-4}$ s, is equivalent to 320 samples, and ESPRIT was applied at each period. Figures 1a-1d show the current obtained from the analytic expression of poles and residues in Table 1 and its reconstruction obtained from the poles and residues extracted by ESPRIT. An almost perfect agreement can be seen between the two curves, indicating the accuracy of the characteristic complex numbers extracted by ESPRIT. In addition, the figures show the forced and natural responses of each of the four elementary circuits.
To evaluate the performance of ESPRIT on nonlinear loads, a current consisting of a fundamental ($I_1$ = 100%) and four harmonics ($I_5$ = 18.9%, $I_7$ = 11%, $I_{11}$ = 5.9%, and $I_{13}$ = 4.8%) was considered. This current can be represented by ten pairwise complex conjugate pole-residue couples. ESPRIT was then used to extract these ten couples, which served to reconstruct the current as shown in Figure 2. As can be seen, ESPRIT is successful in estimating the pole-residue couples of the load.
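The count of ten couples follows from each of the five real sinusoids splitting into two conjugate cisoids. The sketch below builds such a test current from the harmonic amplitudes quoted above. The phases are not given in the text, so zero-phase sine components are assumed here, and amplitudes are expressed relative to the fundamental.

```python
import numpy as np

ts, f0 = 6.25e-4, 50.0
# Harmonic order -> amplitude relative to the fundamental (values from the text).
amplitudes = {1: 1.0, 5: 0.189, 7: 0.11, 11: 0.059, 13: 0.048}

freqs, residues = [], []
for h, a in amplitudes.items():
    # a*sin(x) = (a/2j)*exp(jx) - (a/2j)*exp(-jx): one conjugate pair per harmonic.
    freqs += [h * f0, -h * f0]
    residues += [a / 2j, -a / 2j]

poles = np.exp(2j * np.pi * np.array(freqs) * ts)          # undamped, unit modulus
k = np.arange(32)                                          # one 50 Hz period
current = (poles[np.newaxis, :] ** k[:, np.newaxis]) @ np.array(residues)
direct = sum(a * np.sin(2 * np.pi * h * f0 * k * ts) for h, a in amplitudes.items())
```

The highest component, the 13th harmonic at 650 Hz, lies below the 800 Hz Nyquist limit implied by the sampling period, so all ten poles are distinct on the unit circle.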

Validation on Real Data
In this section, the validation of ESPRIT is carried out on currents of three representative loads: a television set, a vacuum cleaner, and an economy lamp. As in the case of synthetic data, ESPRIT was applied at each period. Figures 3a-3c show the current drawn by the appliances and its reconstruction based on the pole-residue estimates of ESPRIT. The close agreement shown in the figures indicates that the exponential model of equation (1) and its parameters estimated by ESPRIT accurately predict the response of the actual loads. It is worth mentioning that the number of pole-residue couples $d$ increases with the nonlinearity of the load. For instance, the current of the vacuum cleaner could be accurately reconstructed from four pole-residue couples, whereas that of the economy lamp needed up to twelve couples.

FEATURE SPACE
The feature space contains 900 signatures uniformly distributed among the following nine appliances: incandescent lamp, halogen lamp, economy lamp, water heater, electric convector, oven, two-burner hot plate, television set, and computer. As shown in Figure 4, each signature (represented by a point in the three-dimensional feature space) is characterized by three pole-residue products corresponding to the maxima of the fundamental, third and fifth harmonic currents. The restriction to three frequencies has the sole aim of representing the feature space graphically. From the feature space, ten clusters representing the nine appliances can be clearly distinguished. The additional cluster is due to the two-burner hot plate, which is represented by two clusters, one for each burner. It can hence be concluded that the studied appliances can be distinguished fairly well using the fundamental and higher harmonics.

RULE-BASED CLUSTERING
Rule-based clustering is accomplished by applying specific knowledge rather than a specific technique. This is a key idea in expert systems technology. It reflects the belief that human experts do not process their knowledge differently from others, but they do possess different knowledge. With this philosophy, when an expert system does not produce the desired results, work begins to expand the knowledge base, not to re-program the procedures.
The clustering module has the task of reducing the feature space into singleton clusters. Moreover, applying clustering prior to classification has recently been shown to improve the accuracy of classification methods [22]. Forming singleton clusters is rather challenging given the cluster dispersion of some appliances and the unknown number of clusters. To this end, the clustering module calculates the Euclidean distance between the first three elements of the present signature and those of each of the signatures in the unlabeled database. Based on the following rules inferred offline by a human expert from several household appliances, the clustering module then decides either to create a new unlabeled signature in the database or to increase the count of an existing signature:
1: if the unlabeled database is empty then
The outcome of this phase is a compact database of unlabeled signatures to which additional features such as the active power and the current-voltage phase difference can be added. In other words, the unlabeled database is an M × N array where M is the number of identified signatures and N is the number of features in each signature. Applying the clustering module to the 900 signatures of the feature space shown in Figure 4 yields nine singleton clusters depicted as blue squares in Figure 5. It can be noticed that the clustering module represented the two clusters corresponding to the electric convector and one burner of the hot plate by a single cluster due to their heavy overlap, thus missing the tenth cluster.
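The decision between creating a new signature and incrementing an existing one can be illustrated with a deliberately simplified sketch. It does not reproduce the expert-derived rules, which are not listed in full here. Instead, it assumes a single hypothetical distance threshold on the first three features, and the function name, data layout and example values are all invented for illustration.

```python
import math

def update_database(database, signature, threshold):
    """Add `signature` as a new entry or count it toward its nearest entry.

    `database` is a list of dicts with keys "features" and "count".
    `threshold` is a hypothetical distance limit, not a value from the paper.
    Returns the index of the entry that received the signature.
    """
    if not database:
        # An empty database always stores the first signature.
        database.append({"features": tuple(signature), "count": 1})
        return 0
    # Euclidean distance computed on the first three features only.
    distances = [math.dist(signature[:3], entry["features"][:3]) for entry in database]
    nearest = min(range(len(distances)), key=distances.__getitem__)
    if distances[nearest] <= threshold:
        database[nearest]["count"] += 1      # discard the signature, keep the count
        return nearest
    database.append({"features": tuple(signature), "count": 1})
    return len(database) - 1

# Invented three-feature signatures from two well-separated appliances.
stream = [(1.0, 0.1, 0.0), (1.02, 0.11, 0.01), (5.0, 2.0, 1.0),
          (0.99, 0.09, 0.0), (5.1, 2.05, 0.95)]
database = []
for sig in stream:
    update_database(database, sig, threshold=0.5)
```

Each signature is processed once and then dropped, so memory grows with the number of clusters rather than the number of events. A single global threshold would merge heavily overlapping appliances, which is one reason more specific rules may be needed in practice.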

CONCLUSION
This paper discussed the first two modules of an automatic-setup NIALM system. First, the feature extraction module employed ESPRIT to extract complex poles and residues from the electric current. These complex numbers were then used to determine a feature space with reduced inter-cluster overlap. Second, a clustering approach based on rules derived offline by a human expert was proposed to assign a set of signatures to singleton clusters. Results indicated that these two modules succeed in creating an unlabeled database of signatures. Further research should update the clustering rules by testing the developed modules on other datasets.