 
Research on Feature Selection in Power User Identification

Qiu Yanhao, Song Xiaoyu, Sun Xiangyang, Zhao Yang
											
										
									
								 
								
									
Issue: Volume 3, Issue 3, May 2018
Pages: 67-76
									
								 
								
Received: 18 April 2018
Accepted: 7 May 2018
Published: 1 June 2018
									
								 
								
								
								
									
									
Abstract: Previous studies of user identification have mostly focused on improving the recognition algorithm. In this paper, we use big data technology to extract electricity consumption features from different angles and study the impact of different features on recognition. First, the raw data are cleaned. Then, to capture the key information for identifying power theft users, features are extracted from the data set in three categories: basic attribute features, statistical features at different time scales, and similarity features at different time scales. Experiments are carried out with different combinations of these feature sets under the KNN model, the random forest (RF) model and the XGBoost model. The results show that, in all three classifiers, the BF+SF+PF feature set performs clearly better than the other two feature sets. We therefore conclude that the choice of features has a clear effect on the recognition results.
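The abstract does not say which statistics or which similarity measure were used, so the following is only a minimal sketch of what "statistical features" and "similarity features" at different time scales might look like for one user's daily consumption series. The choice of mean and standard deviation, Pearson correlation between adjacent windows, the 7-day and 14-day scales, and all function and feature names are assumptions made for illustration, not the authors' method.

```python
from statistics import mean, pstdev


def split_windows(series, window):
    """Split a series into consecutive non-overlapping windows of equal length.

    A trailing partial window is dropped.
    """
    return [series[i:i + window]
            for i in range(0, len(series) - window + 1, window)]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Returns 0.0 when either sequence is constant (correlation undefined).
    """
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def extract_features(daily_kwh, scales=(7, 14)):
    """Build a flat feature dict for one user (hypothetical feature names)."""
    features = {}
    for w in scales:
        chunks = split_windows(daily_kwh, w)
        # Statistical features: average level and variability per window.
        features[f"mean_of_means_{w}d"] = mean(mean(c) for c in chunks)
        features[f"mean_of_stds_{w}d"] = mean(pstdev(c) for c in chunks)
        # Similarity features: how closely each window resembles the next one.
        sims = [pearson(a, b) for a, b in zip(chunks, chunks[1:])]
        features[f"mean_similarity_{w}d"] = mean(sims) if sims else 0.0
    return features


if __name__ == "__main__":
    # Four identical synthetic weeks of daily consumption (kWh), made up here.
    usage = [1, 2, 3, 4, 5, 6, 7] * 4
    for name, value in extract_features(usage).items():
        print(name, value)
```

Feature dictionaries of this kind, one per user, could then be combined with basic attribute features and passed to any of the classifiers named in the abstract; a user whose consumption pattern changes abruptly would show lower window-to-window similarity than a regular user.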