Thursday, December 31, 2020

System Architecture for “Analytical framework for Social media Content Moderation through Crowdsourcing”


1.     Register contributors
A Facebook page will be created to increase the visibility of the app and to find contributors. OpenID will be used to register contributors via Google, Facebook, etc. Any social media user will be allowed to register, but only those who can read Sinhala and are from Sri Lanka will be selected as contributors. Those who do not use social media will not be allowed to contribute as crowd participants.

2.     Eliminate contributors
Those who do not qualify in the pre-selection process, fail quality control, or fail the trustworthiness process will be eliminated from being contributors on the platform.

3.     Pre-Selection of contributors
The pre-selection process would involve selecting contributors based on qualification (age, demographics, Sinhala and Singlish literacy, etc.), context-specific criteria, trustworthiness, or persona, or selecting them without considering any of these.
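
A qualification-based pre-selection check could be sketched as below. This is only a minimal illustration: the `Applicant` fields, the `MIN_AGE` threshold, and the rule that all conditions must hold together are assumptions, not part of the design described in this post.

```python
from dataclasses import dataclass

MIN_AGE = 18  # assumed threshold; the post does not specify an age limit


@dataclass
class Applicant:
    """Hypothetical record of what a registering user tells the platform."""
    name: str
    age: int
    country: str
    reads_sinhala: bool
    uses_social_media: bool


def is_preselected(applicant: Applicant) -> bool:
    """Return True only if every assumed qualification check passes."""
    return (
        applicant.age >= MIN_AGE
        and applicant.country == "Sri Lanka"
        and applicant.reads_sinhala
        and applicant.uses_social_media
    )


applicants = [
    Applicant("A", 25, "Sri Lanka", True, True),
    Applicant("B", 30, "Sri Lanka", False, True),
]
selected = [a.name for a in applicants if is_preselected(a)]
print(selected)
```

The other pre-selection modes (context-specific, trustworthiness-based, persona-based) would need their own checks, which are not sketched here.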

4.     Question Selection
Implement a question generation mechanism, then generate a questionnaire to build a hate speech corpus and to annotate the posts so that the intention behind the hate speech can be detected and classified.

5.     Fire Questions
Fire questions to the pre-selected contributors using a question-firing mechanism such as a decision tree. Some categories of such questions would check the credibility of information shared on social media, detect fake profiles, and capture user opinion and behaviour. After aggregating the responses from the contributors, generate advanced questions such as the one given below.
E.g. This content has been flagged as hate speech, but x people have shared it. Why do you think people keep sharing it?
·        Because it is fun
·        Sarcasm
·        Irony
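
One way a decision tree could drive question firing is sketched below. The `QuestionNode` class, the `next_question` helper, and the wording of the questions are all hypothetical; they only illustrate how an answer could pick the follow-up question.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class QuestionNode:
    """One question in the tree; every name here is hypothetical."""
    text: str
    # Maps a contributor's answer to the follow-up question node.
    children: Dict[str, "QuestionNode"] = field(default_factory=dict)


def next_question(node: QuestionNode, answer: str) -> Optional[QuestionNode]:
    """Return the follow-up question for an answer, or None if there is none."""
    return node.children.get(answer)


# Illustrative tree; the question wording is an assumption.
root = QuestionNode(
    "Does this post contain hate speech?",
    {
        "yes": QuestionNode("Why do you think people keep sharing it?"),
        "no": QuestionNode("Is the information in this post credible?"),
    },
)

follow_up = next_question(root, "yes")
print(follow_up.text)
```
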
6.     Assign Rewards (Extrinsic Motivators)
Decide on rewards and design badges for contributors, considering the levels completed, the trustworthiness of the contributor, etc.

7.     Incorporate Intrinsic Motivators
Identify features to incorporate for those who contribute in order to build a better cyberspace.

8.     Rewarding Contributors/Gaming Experience
The platform would give contributors a gaming experience to retain them in the cause. Intrinsic and extrinsic motivators will be embedded in the gaming experience.

9.     Model the trustworthiness of a contributor
A mechanism is to be implemented to check the trustworthiness of a particular contributor and to assign a badge for trustworthiness. When assessing the quality of responses, a higher weight will be given to responses from contributors who hold a trustworthiness badge.
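
As a rough sketch of how badge-based weighting might work, the function below computes a weighted share of "hate speech" votes for one post. The function name, the input shape, and the default `badge_weight` of 2.0 are assumptions made for illustration; the actual weighting scheme is still to be designed.

```python
from typing import List, Tuple


def weighted_hate_score(
    responses: List[Tuple[bool, bool]], badge_weight: float = 2.0
) -> float:
    """Return the weighted fraction of responses that say 'hate speech'.

    Each response is (says_hate_speech, has_trust_badge). Responses from
    badge holders count badge_weight times; all others count once.
    """
    total_weight = 0.0
    hate_weight = 0.0
    for says_hate, has_badge in responses:
        weight = badge_weight if has_badge else 1.0
        total_weight += weight
        if says_hate:
            hate_weight += weight
    # With no responses there is nothing to weigh, so report 0.0.
    return hate_weight / total_weight if total_weight else 0.0


# One badge holder says "hate speech"; two other contributors disagree.
score = weighted_hate_score([(True, True), (False, False), (False, False)])
print(score)
```

With equal weights the same three responses would give one third; the badge holder's doubled weight raises the score to one half.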

10.  Aggregation of contributions
Evaluate the results based on human judgment. For each post, store the results (ignored, hate speech or not, category of the hate speech, etc.) in order to aggregate the contributions and to generate advanced questions at a later stage.
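
A simple majority count is one possible way to aggregate the labels for a post. The sketch below assumes each contribution is a plain string label such as "hate", "not_hate", or "ignored"; these label names and the `aggregate_labels` function are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def aggregate_labels(labels: List[str]) -> Dict[str, object]:
    """Summarise the labels for one post by majority count."""
    if not labels:
        return {"label": None, "agreement": 0.0, "counts": {}}
    counts = Counter(labels)
    # On a tie, most_common keeps the label that was seen first.
    top_label, top_count = counts.most_common(1)[0]
    return {
        "label": top_label,
        "agreement": top_count / len(labels),
        "counts": dict(counts),
    }


summary = aggregate_labels(["hate", "hate", "not_hate", "ignored"])
print(summary)
```

The stored counts could later feed the advanced questions, for example by reporting how many contributors chose each label.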

11.  Suggest Preventive measures
After analyzing the post, provide suggestions such as removing the post or removing only a part of it. Use the adaptability of the UX as a preventive measure in social media communication.

12.  Accessibility of peer contribution
Implement a mechanism to regenerate questions based on how others responded, without revealing the identities of the other contributors.

E.g. Sample question
12 contributors or 20% of the contributors have said this post contains hate speech. Do you agree with this statement?
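
A question like the sample above could be generated from aggregate counts alone, so that no individual contributor is revealed. The `peer_question` function below is a hypothetical sketch of that idea.

```python
def peer_question(num_agree: int, num_total: int) -> str:
    """Build a follow-up question from aggregate counts only."""
    if num_total <= 0 or not 0 <= num_agree <= num_total:
        raise ValueError("counts must satisfy 0 <= num_agree <= num_total, num_total > 0")
    percent = round(100 * num_agree / num_total)
    return (
        f"{num_agree} contributors or {percent}% of the contributors have said "
        "this post contains hate speech. Do you agree with this statement?"
    )


question = peer_question(12, 60)
print(question)
```

Only the two counts enter the question text, which keeps the other contributors anonymous.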


Sunday, December 13, 2020

An Analytical Framework for Social Media Content Moderation using Crowdsourcing



With the rapid growth of social media use, the number of user-generated posts is growing exponentially. Social media platforms find it challenging to moderate all these posts before they reach a wider audience, as the posts are written in multiple languages and use different forms of multimedia. The platforms also find it difficult to detect hate speech in content written in local languages such as Sinhala or Singlish, because contextual and linguistic expertise and social and cultural insights are required for accurate hate speech identification.

Research is being carried out on detecting hate speech in English social media data with the help of crowdsourcing platforms. However, further research is still required for local languages. To address this need, in this research we propose a suitable crowdsourcing approach to moderate hate speech in social media content. For this, we propose to implement a crowdsourcing platform with mechanisms to pre-select contributors, reward them, manage contributor reputation, provide analytical capabilities, and moderate hate speech content. With a well-implemented crowdsourcing platform, it will be possible to find more nuanced patterns through human judgment and filtering, and to take preventive measures to create a better cyberspace.





Friday, December 11, 2020

Deciding the technologies, platforms and frameworks to use in developing the framework

Technology is evolving rapidly, so it is crucial to decide which tools to use in developing the apps. In my research, I have to develop a website, a web application, a mobile application, and a Facebook application. The first challenge I faced was whether there is any platform I can use to build all of these with a single technology. I had several options and questions to consider.

First, I could use PHP and Bootstrap to develop a responsive website, so that users can use the web instead of a separately developed app.

Second, I did not have experience in developing Facebook apps and I didn't know the limitations an FB app has.

Third, it was suggested that I use the React or Angular frameworks, which I have never worked with before.

Fourth, I could start by developing an Android application, a process I am familiar with, and then move on to other platforms.

Fifth, I had to decide whether or not to use version control in my project.

I am still pondering the choices. But for the time being, I am setting up the computer to develop Android Applications using Android Studio.


Friday, December 4, 2020

Introduction to my research and key areas


       
            I decided to share the new things that I learn while doing my research on this blog. Before that, it is essential to give you an idea of a few key areas that I will be writing about. As the blog title says, I will be blogging on:

·       Crowdsourcing
·       Social Media and Social Media Content Moderation
·       Use of technology in developing different tools

            First, we will look at the key terms or key areas that are involved in my research. I will be using some content from my research proposal here.

            Crowdsourcing

            Crowdsourcing is a growing and effective way for organizations to gather the best ideas from online communities and use them in ways that benefit both the organization and the contributors. Current crowdsourcing campaigns commonly use social media to obtain a higher number of contributions, which in theory leads to more diverse ideas.

            Social Media

            Social media has become a significant force in sharing information, forming opinions and attitudes, etc. [1]. Facebook is the biggest social network worldwide, with 2.41 billion monthly active users as of the second quarter of 2019. According to the Facebook newsroom, more than 2.1 billion people use Facebook, Instagram, WhatsApp, or Messenger every day on average [2].

           Social Media Content Moderation

            The credibility of the information shared on social media is doubtful, and how to evaluate information credibility on social media platforms has become an important issue for today's information consumers [3]. At the same time, the information shared can be offensive and could lead to social issues. Social media research includes the analysis of citizens' voices on a wide range of topics. Some of these topics would have a direct impact on creating social issues. Therefore, by incorporating a crowdsourcing approach in social media, it is possible to reach a much larger audience and capture user opinions and behaviors. In social computing research, social media has been shown to provide a unique window into the social experience of people [4].
           
            Use of Technology in Developing Tools

            Similar to the social computing research in [4], in our research we focus on identifying a mechanism that uses direct and indirect crowdsourcing to moderate social media content posted by Sri Lankan citizens on various topics. A crowdsourcing platform will be implemented to identify and classify posts that are related to social issues. Furthermore, the crowdsourcing platform will be used to verify the posts identified by the Natural Language Processing (NLP) and Machine Learning (ML) tools developed by my fellow research colleagues.



[1]  Voramontri, Duangruthai & Klieb, Leslie. (2018). “Impact of Social Media on Consumer Behaviour”. International Journal of Information and Decision Sciences. 11. 10.1504/IJIDS.2019.10014191.

[2]  newsroom.fb.com, ‘Stats’, 2019. [Online]. Available: https://newsroom.fb.com/company-info/. [Accessed: 06- Aug- 2019].

[3]  R. Li and A. Suh, “Factors Influencing Information credibility on Social Media Platforms: Evidence from Facebook Pages,” Procedia Computer Science, vol. 72, pp. 314–328, 2015.

[4]  Tim Finin, Will Murnane, Anand Karandikar, Nicholas Keller, Justin Martineau, and Mark Dredze. 2010. Annotating named entities in Twitter data with crowdsourcing. In Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk (CSLDAMT '10). Association for Computational Linguistics, Stroudsburg, PA, USA, 80-88.



Workshop for Annotators

Let's say that you are trying to train a model using labelled data. If your model is to give accurate results, the training dataset should b...