Leveraging Publicly Available Information to Analyze Information Operations

Authors

  • Nico Manzonelli
  • Taylor Brown
  • Antonio Avellandea-Ruiz
  • William Bagley
  • Ian Kloo

DOI:

https://doi.org/10.37266/ISER.2021v9i2.pp142-148

Keywords:

Information Operations, Publicly Available Information, Natural Language Processing, Web Scraping

Abstract

Traditionally, a significant part of assessing information operations (IO) has relied on subject matter experts’ time-intensive study of publicly available information (PAI). Now that massive amounts of PAI are available via the Internet, analysts face the challenge of leveraging that volume effectively to draw meaningful conclusions. This paper presents an automated method for collecting and analyzing large amounts of PAI from China that could better inform assessments of IO campaigns. We implement a multi-model system that acquires data via web scraping and analyzes it using natural language processing (NLP) techniques, with a focus on topic modeling and sentiment analysis. A case study on China’s current relationship with Taiwan, compared against validated research by a subject matter expert, indicates that the methodology is valuable for drawing general conclusions and pinpointing important dialogue across a massive amount of PAI.
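The following is a minimal sketch of the kind of pipeline the abstract describes, not the authors' implementation. It assumes article pages have already been downloaded as HTML, pulls paragraph text out with Python's standard-library parser, and scores it against a sentiment lexicon. The names `TextExtractor`, `extract_text`, `sentiment_score`, `TOY_LEXICON`, and `SAMPLE_HTML` are hypothetical. The lexicon is a toy stand-in for a real word-emotion lexicon, and topic modeling is omitted.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the text found inside <p> elements of an HTML page."""

    def __init__(self):
        super().__init__()
        self._in_paragraph = False
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_paragraph = True
            self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self._in_paragraph = False

    def handle_data(self, data):
        # Ignore text outside paragraphs (headlines, navigation, scripts).
        if self._in_paragraph:
            self.paragraphs[-1] += data


def extract_text(html):
    """Return the paragraph text of a page as one whitespace-joined string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(p.strip() for p in parser.paragraphs if p.strip())


# Toy lexicon (assumption): a real analysis would load a published lexicon.
TOY_LEXICON = {
    "provocation": -1,
    "threat": -1,
    "weakness": -1,
    "peace": 1,
    "cooperation": 1,
    "stability": 1,
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment_score(tokens, lexicon=TOY_LEXICON):
    """Average lexicon polarity per token; 0.0 for an empty document."""
    if not tokens:
        return 0.0
    return sum(lexicon.get(token, 0) for token in tokens) / len(tokens)


# Hypothetical page content used only to exercise the functions above.
SAMPLE_HTML = (
    "<html><body><h1>Headline</h1>"
    "<p>The exercise is a provocation and a threat.</p>"
    "<p>Both sides want peace.</p>"
    "</body></html>"
)

if __name__ == "__main__":
    text = extract_text(SAMPLE_HTML)
    print(text)
    print(sentiment_score(tokenize(text)))
```

Averaging polarity over all tokens keeps scores comparable across articles of different lengths, which matters when many documents are ranked against each other. A production system would also need to handle negation, translation, and dynamically rendered pages.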

Published

2022-01-15

How to Cite

Manzonelli, N., Brown, T., Avellandea-Ruiz, A., Bagley, W., & Kloo, I. (2022). Leveraging Publicly Available Information to Analyze Information Operations. Industrial and Systems Engineering Review, 9(2), 142-148. https://doi.org/10.37266/ISER.2021v9i2.pp142-148