Tech Science Press
3972 research outputs found
Using Imbalanced Triangle Synthetic Data for Machine Learning Anomaly Detection
Extreme data imbalance is the core problem in anomaly detection. Abnormal data are so scarce that they provide too little information for analysis. Mainstream methods focus on taking full advantage of the normal data and treat any sample that does not fit the normal data distribution as an anomaly. From a data science perspective, we instead concentrate on the abnormal data and generate artificial abnormal samples with a machine learning method. Among such techniques, the Synthetic Minority Over-sampling Technique (SMOTE) and its improved variants are representative milestones; they generate synthetic examples at random positions on selected line segments. In our work, we remove the line-segment limitation and propose an Imbalanced Triangle Synthetic Data method. In theory, our method covers a wider range. In experiments with real-world data, our method performs better than SMOTE and its improved variants.
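To make the difference between segment-based and triangle-based generation concrete, here is a minimal sketch. It is not the authors' algorithm: the abstract does not say how points are drawn inside a triangle, so uniform sampling with barycentric weights is an assumption, and the names `segment_sample` and `triangle_sample` are hypothetical.

```python
import random

def segment_sample(a, b, rng):
    """SMOTE-style point: a random position on the segment from a to b."""
    t = rng.random()
    return [x + t * (y - x) for x, y in zip(a, b)]

def triangle_sample(a, b, c, rng):
    """Random point inside triangle a-b-c (assumed uniform sampling)."""
    u, v = rng.random(), rng.random()
    if u + v > 1.0:
        # Reflect the pair so the weights stay inside the triangle.
        u, v = 1.0 - u, 1.0 - v
    w = 1.0 - u - v
    return [w * x + u * y + v * z for x, y, z in zip(a, b, c)]

rng = random.Random(0)
# Three hypothetical minority-class samples in a 2-D feature space.
a, b, c = [0.0, 0.0], [1.0, 0.0], [0.0, 1.0]
seg_points = [segment_sample(a, b, rng) for _ in range(200)]
tri_points = [triangle_sample(a, b, c, rng) for _ in range(200)]
```

With these inputs every segment point stays on the edge between `a` and `b`, while the triangle points spread over the whole area spanned by the three samples, which is one way to read the "wider range" claim.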
Differentially Private Real-Time Streaming Data Publication Based on Sliding Window Under Exponential Decay
Continuously answering range queries on streaming data provides useful information for many practical applications, but it also carries a risk of privacy disclosure. Existing research on differentially private streaming data publication mostly focuses on improving query accuracy, pays less attention to query efficiency, and ignores the effect of timeliness on data weight. In this paper, we propose an effective algorithm for differentially private streaming data publication under an exponential decay model. First, by introducing a Fenwick tree to divide and reorganize the data items in the stream, we achieve constant time complexity for inserting a new item and for computing a prefix sum, and time complexity linear in the number of data items for building the tree. Next, we use the matrix mechanism to handle correlated queries and reduce the global sensitivity. In addition, we choose a suitable diagonal matrix to further improve range query accuracy. Finally, to account for exponential decay, every data item is weighted by the decay factor. By combining the Fenwick tree with matrix optimization, we present a complete algorithm for differentially private real-time streaming data publication. The experiments compare the proposed algorithm with similar algorithms for streaming data release under exponential decay. The results show that the proposed algorithm effectively improves query efficiency while preserving query quality.
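For readers unfamiliar with the data structure, the following is a minimal sketch of a textbook Fenwick tree with a linear-time build and range-sum queries. It is a fixed-size illustration only; the class and method names are hypothetical, and the privacy noise, matrix mechanism, and decay weighting from the paper are not included. In this textbook form each update and prefix sum visits a logarithmic number of nodes; the cost figures in the abstract refer to the authors' own construction, which is not reproduced here.

```python
class FenwickTree:
    """Textbook Fenwick (binary indexed) tree over a fixed-size list."""

    def __init__(self, values):
        # Linear-time build: each node pushes its partial sum to its parent.
        self.n = len(values)
        self.tree = [0.0] + [float(v) for v in values]
        for i in range(1, self.n + 1):
            parent = i + (i & -i)
            if parent <= self.n:
                self.tree[parent] += self.tree[i]

    def add(self, index, delta):
        """Add delta to the item at 0-based position index."""
        i = index + 1
        while i <= self.n:
            self.tree[i] += delta
            i += i & -i

    def prefix_sum(self, count):
        """Sum of the first count items."""
        total = 0.0
        i = count
        while i > 0:
            total += self.tree[i]
            i -= i & -i
        return total

    def range_sum(self, lo, hi):
        """Sum of the items at 0-based positions lo..hi, inclusive."""
        return self.prefix_sum(hi + 1) - self.prefix_sum(lo)

# Hypothetical stream window used only to exercise the structure.
window = FenwickTree([3, 1, 4, 1, 5, 9, 2, 6])
total_mid = window.range_sum(2, 5)
```

A range query is answered as the difference of two prefix sums, which is why fast prefix sums translate directly into fast range queries.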
Leveraging Logical Anchor into Topology Optimization for Indoor Wireless Fingerprinting
Indoor subarea localization has broad applications in dynamic hot-zone identification, indoor layout optimization, dynamic store pricing, and crowd-flow trend prediction. Ubiquitous mobile devices create an opportunity for indoor localization services based on wireless fingerprinting. However, existing methods have been criticized for two shortcomings. First, tagging approaches require a large number of professional site surveys to construct the wireless fingerprints, which weakens their scalability. Second, crowdsourcing-based methods suffer from the cold-start problem in the initial stage of the system. To address these issues, this paper proposes a topology optimization approach that brings dynamic logical anchor selection into a subarea localization system. First, a new annular-based radio map construction strategy with feedback selection of logical anchors is designed to relieve the burden of site surveys. This strategy exploits the characteristics of the indoor building structure and the recognition of overlaps between subareas, and does not require the topology or distribution of physical anchors (e.g., access points or POIs). Second, using a probabilistic support vector machine algorithm, the target is localized to the corresponding subarea in real time. Furthermore, the localization error is calibrated with an error recognition algorithm. Finally, extensive experiments are conducted on a prototype system. The results show that the proposed method reduces the overhead of system initialization and achieves higher localization accuracy than existing approaches.
Forecasting Model Based on Information-Granulated GA-SVR and ARIMA for Producer Price Index
Accurate prediction of the Producer Price Index (PPI) plays an indispensable role in government economic work. However, the PPI is difficult to forecast. In our research, we propose, for the first time, a hybrid model based on fuzzy information granulation that integrates the GA-SVR and ARIMA (Autoregressive Integrated Moving Average) models. The hybrid model is intended to deal with imprecision in PPI estimation. It uses the fuzzy information granulation algorithm to preprocess the monthly PPI training samples and produces three different sequences of fuzzy information granules. A Support Vector Regression (SVR) forecast model is established separately for each sequence, with its parameters optimized by a Genetic Algorithm (GA). Finally, the residual errors of the GA-SVR model are corrected through ARIMA modeling to obtain the PPI estimate. Several comparative experiments show that the PPI value predicted by this hybrid model is more accurate than that predicted by other models, including ARIMA, GRNN, and GA-SVR. The research also confirms the precision and validity of the hybrid model's PPI prediction and demonstrates that the model consistently leverages the forecasting advantages of GA-SVR in nonlinear space and of ARIMA in linear space.
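The granulation step can be pictured with a deliberately simplified sketch. The abstract does not give the window length or the membership function, so the version below is an assumption: it summarizes each non-overlapping window by its minimum, median, and maximum, which yields three sequences in the spirit of the three granule sequences mentioned above. The function name `granulate` and the sample values are hypothetical and are not real PPI data.

```python
from statistics import median

def granulate(series, window):
    """Summarize each full window as (low, typical, up).

    This is a simplified stand-in for fuzzy information granulation,
    not the algorithm used in the paper.
    """
    lows, mids, ups = [], [], []
    for start in range(0, len(series) - window + 1, window):
        chunk = series[start:start + window]
        lows.append(min(chunk))
        mids.append(median(chunk))
        ups.append(max(chunk))
    return lows, mids, ups

# Made-up monthly index values for illustration only.
ppi = [101.2, 100.8, 101.5, 102.1, 101.9, 102.4, 103.0, 102.2, 101.7]
lows, mids, ups = granulate(ppi, window=3)
```

Each of the three output sequences could then be passed to its own regression model, which matches the one-model-per-sequence structure described in the abstract.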
Detecting Iris Liveness with Batch Normalized Convolutional Neural Network
To counter presentation attacks on iris recognition systems, an iris liveness detection scheme based on a batch-normalized convolutional neural network (BNCNN) is proposed to improve the reliability of iris authentication. An eighteen-layer BNCNN architecture is constructed to distinguish genuine irises from fake ones; it comprises convolutional layers, batch normalization (BN) layers, ReLU layers, pooling layers, and fully connected layers. The iris image is first preprocessed by iris segmentation and normalized to 256×256 pixels, and the iris features are then extracted by the BNCNN. With these features, the decision-making layer classifies the iris as genuine or fake. Batch normalization is used in the BNCNN to mitigate overfitting and vanishing gradients during training. Extensive experiments are conducted on three classical databases: the CASIA Iris Lamp database, the CASIA Iris Syn database, and the Ndcontact database. The results show that the proposed method effectively extracts micro-texture features of the iris and achieves higher detection accuracy than some typical iris liveness detection methods.
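As a reminder of what a batch normalization layer computes, here is a minimal sketch of the training-time forward pass for a single feature. It shows the standard operation only, not the BNCNN itself; the default values for `gamma`, `beta`, and `eps` are illustrative assumptions, and running statistics for inference are omitted.

```python
import math

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature across a mini-batch, then scale and shift."""
    n = len(batch)
    mean = sum(batch) / n
    # Biased (population) variance, as used in the training-time forward pass.
    var = sum((x - mean) ** 2 for x in batch) / n
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]

# Hypothetical activations of one feature over a mini-batch of four images.
normalized = batch_norm([1.0, 2.0, 3.0, 4.0])
```

The output has approximately zero mean and unit variance regardless of the input scale, which keeps the inputs to the following layer in a stable range during training.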
Online Magnetic Flux Leakage Detection System for Sucker Rod Defects Based on LabVIEW Programming
To detect defects in sucker rods, a real-time detection system is designed using magnetic flux leakage (MFL) non-destructive testing. An MFL measurement system consists of many parts, and this study focuses on the signal acquisition and processing system. First, the paper describes the hardware of the acquisition system in detail, including the selection of the Hall-effect sensor, the design of the signal conditioning circuit, and the operation of the serial port controlled by the single-chip microcomputer (SCM). The software of the acquisition system is written in LabVIEW, a graphical programming environment, and covers serial port parameter configuration, detection signal recognition, raw signal filtering, real-time display, data storage, and playback. Finally, an experimental MFL detection platform is set up, and MFL measurements are carried out on transverse and longitudinal defects on the sucker rod surface. The experimental results show that the designed acquisition and processing system offers good detection performance, a simple design, and high flexibility.
Tibetan Sentiment Classification Method Based on Semi-Supervised Recursive Autoencoders
We apply the semi-supervised recursive autoencoder (RAE) model to sentiment classification of Tibetan short text and obtain better classification performance. The input to the semi-supervised RAE model is the word vector. We crawled a large amount of Tibetan text from the Internet, trained Tibetan word vectors with Word2vec, and verified their validity through simple experiments. The value of the parameter α and the word vector dimension are important to model performance. The experimental results indicate that the model works best when α is 0.3 and the word vector dimension is 60. Our experiments also show the effectiveness of the semi-supervised RAE model for Tibetan sentiment classification and suggest that the Tibetan word vectors we trained are valid.
Multi-Rate Polling: Improve the Performance of Energy Harvesting Backscatter Wireless Networks
In recent years, researchers have proposed the concept of Energy Harvesting Backscatter Wireless Networks (EHBWN). An EHBWN usually consists of one sink and several backscatter nodes. Backscatter nodes harvest energy from their environment and communicate with the sink by backscattering the carrier wave that the sink transmits. Although a number of access protocols for energy harvesting wireless networks have been proposed, they usually do not take the sink’s receiver sensitivity into account, which makes them unsuitable in practice. In this paper, we first analyze the backscatter channel link budget and the relationship between the effective communication range and the uplink data rate. We then point out that, because of the constraint imposed by the sink’s receiver sensitivity, a single uplink data rate for all backscatter nodes is no longer suitable. We therefore propose Multi-rate Polling, which divides the network into regions with different uplink data rates to ensure correct packet reception by the sink and to improve network performance. Multi-rate Polling also introduces a parameter K; by adjusting it, we can trade off network throughput against fairness to meet the requirements of various scenarios. We validate Multi-rate Polling through simulation under different networks and average harvesting rates. The results show that the proposed protocol effectively improves network performance and scales well, which makes it suitable for EHBWN.
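The link between uplink data rate and communication range can be illustrated with an idealized free-space calculation. This is a sketch under stated assumptions, not the paper's analysis: it assumes a monostatic round-trip link, a receiver bandwidth equal to the bit rate, and made-up values for transmit power, antenna gains, noise figure, required SNR, backscatter loss, and carrier frequency. The function names are hypothetical.

```python
import math

def sensitivity_dbm(rate_bps, noise_figure_db=6.0, snr_db=10.0):
    """Receiver sensitivity, assuming bandwidth equals the bit rate."""
    thermal_noise_dbm_per_hz = -174.0
    return (thermal_noise_dbm_per_hz + 10.0 * math.log10(rate_bps)
            + noise_figure_db + snr_db)

def max_range_m(rate_bps, tx_dbm=30.0, sink_gain_db=6.0, node_gain_db=2.0,
                backscatter_loss_db=6.0, freq_hz=915e6):
    """Largest distance at which the backscattered signal is still decodable.

    Round trip in dB:
    P_rx = P_tx + 2*G_sink + 2*G_node - L_backscatter - 2*FSPL(d)
    """
    wavelength = 3.0e8 / freq_hz
    budget_db = (tx_dbm + 2.0 * sink_gain_db + 2.0 * node_gain_db
                 - backscatter_loss_db - sensitivity_dbm(rate_bps))
    one_way_loss_db = budget_db / 2.0
    # Invert FSPL(d) = 20*log10(4*pi*d / wavelength).
    return wavelength / (4.0 * math.pi) * 10.0 ** (one_way_loss_db / 20.0)

range_low_rate = max_range_m(10e3)    # 10 kbit/s uplink
range_high_rate = max_range_m(100e3)  # 100 kbit/s uplink
```

Under these assumptions a tenfold increase in data rate raises the required signal level by 10 dB, and because the signal crosses the distance twice, the usable range shrinks by a factor of 10^(1/4). That is the kind of rate-versus-range trade-off that motivates assigning different uplink rates to different regions.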
YATA: Yet Another Proposal for Traffic Analysis and Anomaly Detection
Network traffic anomaly detection has gained considerable attention over the years in many important areas. Traditional anomaly detection methods produce quantitative results derived from multi-source information, which makes it difficult for administrators to understand and handle the underlying situation. This study proposes yet another method for traffic analysis and anomaly detection (YATA), based on the cloud model. YATA adopts forward and backward cloud transformation algorithms to fuse the quantitative values of the acquired data into the qualitative concept of anomaly degree. The method gives a rapid and direct view of network traffic. Experimental results on a standard dataset indicate that the proposed method detects attack traffic well enough to meet the expected requirements.
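The two cloud transformations can be sketched with the commonly used normal cloud model, in which a qualitative concept is described by an expectation `ex`, an entropy `en`, and a hyper-entropy `he`. This is an assumption about which cloud variant is meant: the abstract does not say which generators YATA uses or how traffic features are mapped to them, and the function names and parameter values below are hypothetical.

```python
import math
import random

def forward_cloud(ex, en, he, n, rng):
    """Generate n cloud drops (value, membership) from (ex, en, he)."""
    drops = []
    for _ in range(n):
        en_i = abs(rng.gauss(en, he))      # per-drop entropy
        x = rng.gauss(ex, en_i)            # drop position
        mu = math.exp(-(x - ex) ** 2 / (2.0 * en_i ** 2))
        drops.append((x, mu))
    return drops

def backward_cloud(samples):
    """Estimate (ex, en, he) from sample values alone."""
    n = len(samples)
    ex = sum(samples) / n
    en = math.sqrt(math.pi / 2.0) * sum(abs(x - ex) for x in samples) / n
    var = sum((x - ex) ** 2 for x in samples) / (n - 1)
    # Clamp at zero: sampling noise can make var - en^2 slightly negative.
    he = math.sqrt(max(var - en ** 2, 0.0))
    return ex, en, he

rng = random.Random(42)
drops = forward_cloud(ex=10.0, en=2.0, he=0.2, n=20000, rng=rng)
ex_hat, en_hat, he_hat = backward_cloud([x for x, _ in drops])
```

The backward step turns a set of numeric measurements into three descriptive parameters, and the forward step turns those parameters back into values with membership degrees, which is the quantitative-to-qualitative fusion the abstract refers to. The hyper-entropy estimate is noisy for small samples.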
A New Model for the Characterization of Frozen Soil and Related Latent Heat Effects for the Improvement of Ground Freezing Techniques and Its Experimental Verification
The correct determination of thermal parameters, such as the thermal conductivity and specific heat of soil during freezing, is the most important and basic problem in designing an appropriate freezing method. In this study, a calculation model covering three stages of soil temperature was established. At the unfrozen and frozen stages, the specific heats of dry soil, water, and ice are known, so a calculation model for unfrozen and frozen soils can be established according to the principle of superposition. Informed by a laboratory experiment, the latent heat of the adjacent zone was calculated for the freezing stage based on the different water contents in each temperature interval. Both the latent heat and the specific heat of water, ice, and soil particles were calculated by superposition according to their weight percentages. A calculation model of the specific heat at the freezing stage was built, which provides both guidance and a theoretical basis for calculating the specific heat of frozen soil.
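The superposition principle for the unfrozen and frozen stages can be sketched as a mass-weighted average of the component specific heats. This is an illustration only: the specific heats of water and ice are approximate textbook values, the dry-soil value and water content in the example are assumptions, the function name is hypothetical, and the latent-heat term of the freezing stage is not modeled.

```python
C_WATER = 4.18  # kJ/(kg*K), approximate specific heat of liquid water
C_ICE = 2.09    # kJ/(kg*K), approximate specific heat of ice near 0 °C

def soil_specific_heat(c_dry, water_content, frozen_fraction=0.0):
    """Mass-weighted specific heat of moist soil, in kJ/(kg*K).

    water_content is the mass of water per unit mass of dry soil;
    frozen_fraction is the share of that water present as ice.
    """
    ice = water_content * frozen_fraction
    liquid = water_content - ice
    return (c_dry + liquid * C_WATER + ice * C_ICE) / (1.0 + water_content)

# Assumed example: dry soil at 0.84 kJ/(kg*K) with 25% water content.
c_unfrozen = soil_specific_heat(0.84, 0.25)
c_frozen = soil_specific_heat(0.84, 0.25, frozen_fraction=1.0)
```

Because ice has roughly half the specific heat of liquid water, the fully frozen value is lower than the unfrozen one for the same water content.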