Exploración del poder predictivo de datos extraídos de StockTwits respecto a la dirección de variación futura del precio de un activo transado en la Bolsa de Valores de Nueva York
Exploring the predictive power of data extracted from StockTwits for the direction of future price changes of a stock traded on the New York Stock Exchange
Resumen (es)
Diariamente se generan grandes volúmenes de información, especialmente en las redes sociales. El uso de esta información como insumo para el estudio del comportamiento de los agentes en el mercado de valores ha venido cobrando fuerza, especialmente en el campo del aprendizaje de máquina. Por ello, en este artículo se presenta un estudio de la capacidad predictiva de la información que generan los agentes del mercado en la red social StockTwits sobre la dirección de la variación futura del precio de un activo transado en la Bolsa de Valores de Nueva York, valiéndose de herramientas de minería de datos y algoritmos de aprendizaje de máquina.
Abstract (en)
Large volumes of data are generated daily, especially on social networks. The use of this data as an input for studying the behavior of stock market agents has been gaining traction, especially in the field of machine learning. This article therefore studies the predictive power of the information that market agents generate on the StockTwits social network over the direction of future price changes of a stock traded on the New York Stock Exchange, using data mining tools and machine learning algorithms.
License
The authors retain the rights to their articles and are therefore free to share, copy, distribute, perform, and publicly communicate the work under the following conditions:
Give credit for the work in the manner specified by the author or licensor (but not in a way that suggests that they endorse you or your use of the work).
Comunicaciones en Estadística is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license (CC BY-NC-SA 4.0).