Online Media as a Price Monitor: Text Analysis using Text Extraction Technique and Jaro-Winkler Similarity Algorithm

Nurcahyawati, Vivine ORCID: https://orcid.org/0000-0002-6611-9974 and Mustaffa, Zuriani (2020) Online Media as a Price Monitor: Text Analysis using Text Extraction Technique and Jaro-Winkler Similarity Algorithm. In: 2020 Emerging Technology in Computing, Communication and Electronics (ETCCE). Dhaka, Bangladesh, pp. 1-6. ISBN 978-1-6654-1962-8

Abstract

Online media has become an essential part of everyday life in modern society. Any individual or organization is free to share opinions and feelings there on any topic, including information or news about commodity price fluctuations. Commodity price data from the National Strategic Price Information Center (NSPIC) website is not real-time, so it is not a sufficient basis for monitoring commodity price fluctuations. Meanwhile, the government needs to collect data and information about these price fluctuations quickly, so that strategic decisions and policies to stabilize prices can be made promptly. This study explores the potential of online media as a price monitor by extracting and analyzing its text to surface the commodity price data sought. The commodities used as search keywords were those with the highest consumption level in Indonesia in 2016. The texts analyzed were taken from three online media: Twitter, Liputan6.com, and Detik.com. They were analyzed using text extraction techniques, and the Jaro-Winkler algorithm was applied to find commodity prices in the text collection. The results of the text analysis were then compared with commodity prices from the NSPIC website. The experimental data set comprised 99,007 records collected over three months. Only 122 records matched the keywords; these were divided into 100 training records and 22 testing records. The text analysis shows that text from the Detik.com website yields commodity prices closest to the NSPIC price data, while Twitter yields the farthest. The accuracy, evaluated with a confusion matrix, was 75%. Based on this research, online media texts are a viable source for monitoring commodity price fluctuations.
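
To illustrate the kind of string matching the abstract refers to, the following is a minimal Python sketch of the standard Jaro-Winkler similarity and of using it to find tokens in a text that resemble a commodity keyword. It is not the authors' implementation. The scaling factor p = 0.1 and the prefix cap of 4 are the commonly used defaults, while the helper `find_keyword_tokens`, the 0.9 threshold, and the example keyword "beras" are hypothetical choices made only for this illustration.

```python
from typing import List, Tuple


def jaro(s1: str, s2: str) -> float:
    """Standard Jaro similarity between two strings, in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters count as matching only within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2  # number of transpositions

    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted for a shared prefix (assumed defaults p=0.1, cap 4)."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return j + prefix * p * (1 - j)


def find_keyword_tokens(
    text: str, keyword: str, threshold: float = 0.9
) -> List[Tuple[str, float]]:
    """Hypothetical helper: return tokens whose similarity to keyword meets threshold."""
    hits = []
    for token in text.lower().split():
        score = jaro_winkler(keyword.lower(), token)
        if score >= threshold:
            hits.append((token, score))
    return hits


if __name__ == "__main__":
    # Textbook pair: Jaro is about 0.944, Jaro-Winkler about 0.961.
    print(round(jaro_winkler("MARTHA", "MARHTA"), 4))
    # "beras" (rice) is a hypothetical keyword, not one taken from the paper.
    print(find_keyword_tokens("harga beras naik", "beras"))
```

In a price-monitoring pipeline, a fuzzy match of this kind could tolerate misspelled or abbreviated commodity names in tweets and news text before a price value is extracted from the surrounding words. How the study combines the two steps is not described in this record.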


Item Type: Book Section
Additional Information: https://doi.org/10.1109/ETCCE51779.2020.9350898
Uncontrolled Keywords: Fluctuations, Text analysis, Social networking (online), Training data, Media, Monitoring, Testing
Dewey Decimal Classification: 000 - Computer science, information & general works > 000 Computer science, knowledge & systems > 000 Computer science, information & general works
Divisions: Perpustakaan > Prosiding/Call for Papers
Depositing User: Agung P. W.
Date Deposited: 26 Jul 2022 14:29
Last Modified: 26 Jul 2022 14:29
URI: http://repository.dinamika.ac.id/id/eprint/6523
