Machine Learning by Communities, for Communities
When was the last time you thought about that blank text field where members of your community can leave comments? That text field and blinking cursor are the closest we have to pauses between human interaction on the internet. In this episode, Perspective’s product manager, Cj Adams, encourages us to think about how we might innovate that text field and blinking cursor in hopes of having more inclusive, difficult, and natural conversations.
Cj also explains how Perspective can help. Its API can be implemented in a variety of ways, all with the goal of gauging the impact a comment might have on a conversation. But Cj also explains that machine learning is not flawless, and he reminds us that the humans responsible for training it are the source of its biases. So, just like with any other tool that you consider for your community, think about how you can implement it with your community in mind and not as the be-all, end-all solution for creating better conversations.
Cj also shares:
- How Perspective creates a conversation around moderation
- Why Perspective is a tool for communities small and large
- What machine learning does when it’s “really stupid”
Big Quotes
Using machine learning to encourage community-minded conversations: “People can be a little too focused on how good is the machine learning, or what is the exact technology behind it, [but] a lot of it comes down to what’s the actual user experience? What does that feel like to type something in? That idea of just giving someone a moment when they submit [a comment] to give them some feedback instead of moderating after the fact is a really powerful thing. … It’s a little bit more like when you’re in a human conversation.” –@adamscj
The bias in teaching machines: “There are three main types of machine learning. Supervised learning, unsupervised learning, and then reinforcement learning. What we are talking about here is supervised learning, where you put in a bunch of training data and machine learning is trying to identify patterns in that data and then be able to, in this case, classify new examples to say what this does or doesn’t look like. One of the things that you have to recognize is that it’s going to be as smart or as dumb as what it learns from. … Machine learning is only as right as whatever the data was that it was trained on.” –@adamscj
Addressing the challenge of an empty text box and a blinking cursor: “In a lot of forums, we have an empty white box with a blinking cursor, and that’s how we talk to other humans. … Is there anything we could do here that’s a little more creative, a structure that might facilitate less toxicity, that might facilitate people to understand the humanity of the people they’re talking to? I don’t know what that answer is there. I am excited by trying to figure out how we might nudge people in a direction towards understanding people they disagree with, listening to people, learning and sharing their views in a way that fosters understanding and fosters people being able to keep talking even when they disagree.” –@adamscj
About Cj Adams
Cj Adams is a product manager at Jigsaw, part of Alphabet (Google), focused on building technology to make people safer around the world. Since 2015 he has been the product manager of Perspective, an API that helps communities and platforms use machine learning to protect voices in conversation.
Previously he led Project Shield, a free DDoS mitigation service for news organizations, and before Google, he helped build a national confidential hotline for victims of human trafficking at a non-profit called Polaris.
Related Links
- Cj Adams on Twitter
- Jigsaw
- Perspective
- Project Shield, a free DDoS mitigation service from Google
- Polaris, a hotline for victims of human trafficking
- Bassey Etim, editorial director at Canopy
- Andrew Losowsky, project lead of the Coral Project
- Greg Barber, co-founder of the Coral Project
- Project Respect, a machine learning training program by Google
- Twitter sabotages Tay Bot, Microsoft’s AI chatbot
- The False Positive
- Change My View on Reddit
Transcript
[00:00:05] Announcer: You’re listening to Community Signal. The podcast for online community professionals. Tweet with @CommunitySignal as you listen. Here’s your host, Patrick O’Keefe.
[00:00:24] Patrick O’Keefe: Hello and welcome to Community Signal. Jigsaw, from Google’s parent Alphabet, is building tools aimed at empowering better conversations online. On this episode, we’re talking with product manager Cj Adams about the limits of machine learning, their Perspective API, and how smaller communities can make use of these tools.
Thank you to our loyal supporters on Patreon, including Marjorie Anderson, Katherine Mancuso and Carol Benovic-Bradley. If you’d like to join them, please visit communitysignal.com/innercircle.
Cj Adams is a product manager at Jigsaw, part of Alphabet and Google, focused on building technology to make people safer around the world. Since 2015, he has been the product manager of Perspective, an API that helps communities and platforms use machine learning to protect voices in conversation.
Previously he led Project Shield, a free DDoS mitigation service for news organizations. Before Google, he helped build a national confidential hotline for victims of human trafficking at a non-profit called Polaris. Cj, welcome to the show.
[00:01:21] Cj Adams: Thanks so much for having me. I’m happy to be here.
[00:01:23] Patrick O’Keefe: It is great to have you on. How long have you been working on Perspective API?
[00:01:30] Cj Adams: I think the beginnings of the project started in probably 2015, so maybe three and a half years, four years almost. That started with a research project we called Conversation AI, trying to look at how machine learning might be able to improve and help people have good conversations online.
[00:01:50] Patrick O’Keefe: I thought it had been at least a few years, because I’ve been familiar with it for a while. I think I may have first heard of it through a friend of mine, Bassey Etim, who recently left the New York Times and has been on the show a couple of times. I want to say that I heard about it from him a long time before there was a project launched there. I remember going to the website for Perspective API. I saw the textbox where you could enter text and have it rated by toxicity.
When I looked at that and I read reaction to it, it almost felt like that textbox gamified [chuckles] the project. It’s like people visit and they enter a sentence or two devoid of greater context and they receive an answer, like it’s a quiz on their favorite quiz site. What level of toxicity are you? [laughs] I was thinking about the seriousness of the effort and that textbox and how people have used it. Do you feel like the textbox has given people the wrong idea?
[00:02:50] Cj Adams: I think that’s a fair criticism, in the sense that when we met – Bassey is fantastic, and what the New York Times is doing is really incredible, trying to facilitate scaled discussions on really important topics. We were looking at trying to figure out ways that we could help people who are running communities do it faster, and have the ML be an assistant to help them move through that task.
We also didn’t want to limit the use of Perspective or ML to what we could think of. We wanted to make something that anyone could use and have that power of ML to facilitate their own community goals. That didn’t mean just moderation. The textbox that you saw was imagining what a comment box could be if it gave you feedback as you type, as to whether or not what you wrote might align with your community standards.
It was this idea that it’s not just about moderation. Maybe we could go into authorship, giving authors feedback, or viewership. Instead of moderating, you just let people see the mean stuff or let them skip it, and give them that control. The website was a way of exploring these less traditional uses, challenging what might be possible if you have the ML, what ML might enable. That’s what those two experiments were.
In the end, I think it did lead to some misunderstanding of what Perspective is, but some very cool things have come out of it. Concretely, on the product side, Coral Project has a very cool implementation of Perspective where authors get feedback. When they type something in and click submit, it says, “Hey. This might violate our community guidelines. Please take another look.” If they think it’s wrong, we say, “Great. We’ll have a human review it. No problem.” They have that second to review.
Publishers that have used that have found that it’s really decreased the number of comments that they need to moderate and the toxicity that comes through, because people take that moment to edit and think about the community. I do think that sometimes people see that text box and think that is Perspective. It’s a fair criticism that while it’s the most visible version, Perspective is an API, and you can do whatever you want with it in terms of building products and services to support communities.
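Here is a minimal sketch of the kind of pre-submit authorship feedback flow described above, assuming a hypothetical `handle_submit` helper and an illustrative 0.9 threshold; it is not Coral’s actual implementation.

```python
FEEDBACK_THRESHOLD = 0.9  # illustrative; publishers typically set a high bar

def handle_submit(comment_text: str, score: float, author_insists: bool) -> str:
    """Pre-submit feedback: nudge the author once, but never block them outright.

    `score` is a 0-1 toxicity probability for `comment_text`, e.g. from Perspective.
    """
    if score >= FEEDBACK_THRESHOLD and not author_insists:
        # Give the author a moment to rephrase before anything is posted.
        return "This might violate our community guidelines. Please take another look."
    if score >= FEEDBACK_THRESHOLD and author_insists:
        # The author says the model is wrong, so post it and queue it for a human.
        return "Posted; queued for human review."
    return "Posted."

print(handle_submit("You are all idiots.", score=0.96, author_insists=False))
print(handle_submit("You are all idiots.", score=0.96, author_insists=True))
print(handle_submit("Great analysis, thanks!", score=0.02, author_insists=False))
```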
[00:04:50] Patrick O’Keefe: For those listening, we had Andrew Losowsky from the Coral Project on a previous episode of this show, as well as Greg Barber from the Washington Post who has worked on Coral. He might still be working on it. I don’t know. I know they were just acquired by Vox. You may be familiar with it. If not, check those episodes.
It’s funny to hear the preemptive strike for moderation because there’s a really basic, basic thing I’ve done in my personal communities for probably 12 years, where the word censor feature on community software has pretty much remained unchanged forever. You have a list of bad words. It will put a little asterisk or replace it or whatever. I still run an old version of phpBB. I had someone write up a script so that if you hit that censor list, you were told before the post went anywhere that you hit it, here’s what it was.
It would highlight what was wrong. “Here’s your post below, edit it, submit it.” I’ve always talked about the word censor feature because they all have it and it never changes. There are communities where that could be of use. Certain communities may not be receptive to that preemptive feedback, and maybe it would just be a way to get around the censor. For most communities, we forget that most communities in the world are fairly small.
We think about Facebook or YouTube comments or whatever, but there are a lot of communities out there that are, say, I don’t know, let’s say 1,000 uniques a month or less. That’s probably where the majority of online communities are. Even at, say, 25,000 a month, they don’t have the same problems, so those little tweaks help their communities, where people actually want to participate, be a part of something greater, and value what the community is.
It’s always funny because I feel like those features are often neglected. I wrote a blog post about that feature literally a decade ago and said, “Here, take it, use it, do other things with it. It’s an idea. You don’t own it. Just please, please put it in things. Use it.” Because I want the administrative side to get better, but the administrative side is often lacking in forum software and community software, so I think that’s a really interesting thing.
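A minimal sketch of the pre-submit censor-list check Patrick describes, with a made-up word list and a simple whole-word match; the actual phpBB script isn’t reproduced here.

```python
import re

# Made-up censor list; a real community would load its own.
CENSOR_LIST = {"badword", "otherbadword"}

def censor_hits(post_text: str) -> list[str]:
    """Return any censored words found, so the author can edit before the post goes anywhere."""
    # Whole-word match, so "assistant" doesn't trip a filter aimed at a shorter word inside it.
    words = re.findall(r"[a-z']+", post_text.lower())
    return [w for w in words if w in CENSOR_LIST]

draft = "This draft contains badword somewhere."
hits = censor_hits(draft)
if hits:
    # Instead of silently swapping in asterisks, show the author what tripped the
    # filter and let them edit and resubmit.
    print("Your post hit the censor list:", ", ".join(hits))
else:
    print("Post accepted.")
```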
[00:06:56] Cj Adams: It’s cool that you were doing that same thinking a long time ago.
[00:06:58] Patrick O’Keefe: No Is or Ls, though. No intelligence or learning, just Ds, dumb.
[00:07:03] Cj Adams: I think that sometimes people can be a little too focused on how good is the ML, or what is the exact technology behind it, when a lot of it comes down to what’s the actual user experience of entering something. What does that feel like to type something in? That idea of just giving someone a moment when they submit to give them some feedback instead of moderating after the fact is a really powerful thing.
I think that it’s a little bit more like when you’re in a human conversation. If I’m talking to you and I just suddenly shout at you and yell and call you a bad name, you lean back and open up your eyes and it’s, “Whoa, what was that?” It’s can you bring a little bit of that humanity to this comment box. Can you bring a little bit of that feedback loop to someone that gives them the benefit of the doubt, and I think that’s a powerful thing.
Thinking about the places where we’ve seen that authorship feedback be really successful in the wild, too, something that’s interesting is that toxicity as a term was very effective, in the sense of being a somewhat nebulous term that lots of people doing data annotation can agree on. If you ask whether this is offensive to you, they might be saying that the topic is offensive or the opinion is offensive.
If you talk about toxicity, people generally agree with you when you’re talking about how something is said, and you get really good agreement because it’s a charged word but it’s also somewhat nebulous. It works really well to annotate things and build machine learning models. At the same time, we found that using the word toxicity in a UX is terrible. That box that we have on the demo site is trying to tell people what Perspective’s doing, so it uses that word.
Using that in a UX is not good. What Coral Project or others have done is use the model to then surface a message to users that really gives them the benefit of the doubt, which you should be doing, and say, “Hey, before posting this, check out our community guidelines.” If you think that it still follows them, great. If not, take a moment and just see if there’s a way you can phrase this that contributes to the community more productively. I think that’s a really powerful moment, to take that space in a conversation, just give someone that second chance and give them the benefit of the doubt.
[00:09:03] Patrick O’Keefe: What I found is that when we did that little, tiny censor block thing, removals by moderators for profanity went down 90+%. That’s my rough estimate, and it’s a small community, like most communities are. We just didn’t remove posts for that reason anymore because the audience that we had adjusted their content. The way I would moderate, and still moderate now, is that self-censorship doesn’t make the word okay. I have that as a standard.
The F word’s bad, f*ck is bad, too. That’s not okay. You would get people who would do that and we’d have to remove it, or even if the censor feature kicked in and threw an asterisk in there, we’d have to remove it either way. Moderators love it and I love it because we don’t have to spend time on that pretty simple application. Yes, people post other words. Yes, there are other languages. There are certain words that are part of other words that make it tough, like A-S-S, assistant. There are lots of different things that crop in, but for the most part we don’t want to remove posts for that. Users like it because their post isn’t removed after they made it and they have an opportunity to fix it. There’s not a negative experience there. Obviously you can wedge automation into things and create a negative experience sometimes, but that’s not the case here. There’s really little to no negative that we’ve seen at all, and it’s a case where moderation can be automated.
[00:10:22] Cj Adams: Yes. I think that giving the person the chance to rephrase too and say in a way that does follow the community standards is a great opportunity. Like you said, there’s not this feeling like, “Oh, the thing I had to say is no longer there.” It’s just asking to say, “Hey, consider these community guidelines, so that you can say in a way that can stay on the site.”
I think it’s also important to note that there is not one standard of good words or bad words, and with bad word lists, too, context is everything. I think that with Perspective, people can set whatever threshold they want. Usually, people set a very high threshold for doing these kinds of notifications, like above a 0.9 or something like that on our toxicity score. We also have a set of subtypes of toxicity that are less known, where you can get scores for different attributes.
Like likeliness to be sexually explicit, or contain profanity, or be a threat. You can say, for my community, I don’t really care about profanity, that’s fine. You can weigh those different subtype models differently to try and build that custom mix that really matches your community’s guidelines. The thing that I’m excited about is, can we create ML that’s really by the community, for the community, that matches their standards and their values? Can we give it to communities in an API, so that they can surface it however they want to?
Both the models themselves and how those are used and leveraged can be decided by them. Sometimes that’s authorship feedback, sometimes people just want to sort the things in their review queue, other people just want to leave everything up and give people an option to hide or show some of the stuff that’s toxic. I really like that kind of diversity of solutions, because it lets people find what’s right for them and what’s right for their community.
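A rough sketch of the kind of custom mix and threshold Cj describes, assuming you already have per-attribute scores back from the API; the attribute names follow Perspective’s conventions, but the scores, weights, and threshold here are all illustrative.

```python
# Illustrative per-attribute scores (0-1 probabilities) for one comment; the values are made up.
scores = {
    "TOXICITY": 0.92,
    "THREAT": 0.15,
    "INSULT": 0.70,
    "PROFANITY": 0.95,
    "SEXUALLY_EXPLICIT": 0.05,
}

# A community that doesn't care about profanity can weight it to zero and lean on
# the attributes that actually match its guidelines. Weights are illustrative.
weights = {
    "TOXICITY": 0.5,
    "THREAT": 0.3,
    "INSULT": 0.2,
    "PROFANITY": 0.0,
    "SEXUALLY_EXPLICIT": 0.0,
}

mixed = sum(weights[attr] * scores.get(attr, 0.0) for attr in weights)

# A high threshold keeps notifications rare and gives authors the benefit of the doubt.
THRESHOLD = 0.6  # illustrative for this particular mix
if mixed >= THRESHOLD:
    print(f"Mixed score {mixed:.2f}: flag for review or show authorship feedback.")
else:
    print(f"Mixed score {mixed:.2f}: let it through.")
```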
[00:12:06] Patrick O’Keefe: Let’s talk about that. You said in our preshow questionnaire that the goal, or one of your goals, is to build and share access to machine learning tech in a way that it remains a tool “by communities, for communities,” the phrase you just used, and can be easily and transparently customized for each community’s unique needs. What does that look like? How can a community with little or no budget, maybe a one-person community with some volunteers, which is fairly common, do that? How easy is it for them to customize something like this right now, and where would you like it to get to?
[00:12:36] Cj Adams: The customization of how it’s applied is, today, very easy. You hand in a bit of text and you can ask for whichever model you’re interested in. There are about 10 public models, and you get back a score between zero and one. That score doesn’t represent the severity of toxicity, it represents the similarity to other things that are toxic. It’s a probability score that we think this is going to be similar to things that other people have said are toxic, or are an attack, or whatever model you’re requesting.
You hand in a bit of text and get back a score, and that’s free. Anyone can get an API key, and sending up to 10 queries a second is free. You can hand in that bit of text, get back a score, and then do whatever you want with that. If you have technical skills yourself, you can program that into whatever experience you want. If you don’t have an engineering background or don’t have interest in building your own solution, there are a lot of different existing widgets or off-the-shelf tools that you can use that have built-in support for Perspective.
Discourse has a plugin, and they do a cool thing where you can give a little bit of feedback notice, but they also let you set– maybe you only want it to apply to people for their first month in a community, for example. They have some cool features around that. Coral Project, which we discussed, has something where you can set the threshold, set the model, and then you can either just have it sort and flag things by their toxicity score on the backend, or you can also enable this authorship feedback on the front end.
Maybe half a dozen major platforms have built these plugins. The disadvantage is that they’ve already decided what it’s used for, either as a flag for moderators or as authorship feedback. The plus side is that you can just take it, install it directly yourself, and use that widget as it is. There’s also Disqus. As a platform, they have a feature right on the moderation view where you can sort by toxicity. They use a very high threshold, I think it’s well above 0.9, so if you’re trying to review a lot of things in Disqus, you can filter by toxicity and just grab those. If you’re a small platform, a small community, you can use any of those off-the-shelf solutions.
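For anyone who wants to skip the plugins and call the API directly, here is a minimal sketch of the hand-in-text, get-back-a-score flow, assuming the commentanalyzer v1alpha1 REST endpoint and a placeholder API key; check perspectiveapi.com for the current request format.

```python
import json
import urllib.request

API_KEY = "YOUR_API_KEY"  # placeholder; a free key allows up to 10 queries per second
URL = ("https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"
       f"?key={API_KEY}")

def toxicity_score(text: str) -> float:
    """Hand in a bit of text, get back a 0-1 probability that it resembles
    comments other people have labeled as toxic."""
    body = {
        "comment": {"text": text},
        "requestedAttributes": {"TOXICITY": {}},
    }
    request = urllib.request.Request(
        URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        result = json.load(response)
    return result["attributeScores"]["TOXICITY"]["summaryScore"]["value"]

print(toxicity_score("You are a wonderful person."))
```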
[00:14:44] Patrick O’Keefe: The off the shelf models are based on conversations that you viewed and scored from other sources, like I believe that Wikipedia talk pages are one. I don’t know if those are fed into a model, I just know that’s one area you’ve looked at, right?
[00:14:57] Cj Adams: Yes, they’re from comments, millions of comments from across the internet, from a lot of different sources, and those go into the general models. What I was describing is the different ways that you can customize how those scores are surfaced. Now, what about customizing the models themselves? That’s the really cool, exciting space that we’re exploring now. Today, the best way to customize is that we have a set of what we call subtypes.
These are like types of toxicity. Toxicity we define as a comment that’s rude, disrespectful, or otherwise likely to make you leave a conversation. Different people leave for different reasons, and different communities have different ideas of what is toxic. While that model is good as a general sorting model, and it fits a lot of different community standards if you use a high threshold, it doesn’t fit everyone.
A good example of this is, if you look at sports communities, for example, they might really like talking about crushing and killing the other team and destroying them. Some of that language might trigger some of the aspects of toxicity that sound like threats. That’s what they want. They want to be yelling and shouting and threatening the other team in this way. For a use case like that, we have a set of models. They’re currently experimental, so they’re not in production but they break out these sub-attributes of toxicity.
Like I mentioned, sexually explicit, obscene, threats, personal attacks, and you can combine those different ones with a weighting that really matches your community standards. We’ll be releasing more tools that help you build those custom mixes. Right now, the only one I think we have out is an open source bot for Reddit where you can mix and match the weights of the different models, and then have it flag and notify you as a moderator using those weights.
But long term, where we want to get is something that’s even easier. Instead of you having to do any of that weighting on your own, the long-term dream is to be able to say, here’s a bunch of things that we in our community consider toxic, and you can upload that data, and then we could create that custom mix for you and serve it directly to you. That is our long-term ideal, but it doesn’t yet exist.
[00:17:00] Patrick O’Keefe: Very cool. [chuckles] Yes, exactly where I was going. What I was going to say also was that– you touched on this in the end, but is the thought that it could get to a point where, say, as part of these add-ons for popular community software, the people who run these communities could spend three months, six months, viewing each piece of content and marking it as either okay or not okay?
It’s some button that is added in their moderation console, let’s say, and then that would be the model that they could then deploy. It sounds like that. You just basically said that they could take the text, dump it, and upload it, and in the future that’s where you hope to get to. That’s basically the same thing. That’s the way that it can make sense for a smaller community to do that, to really look at it in line, put in a little time, and receive their model.
[00:17:44] Cj Adams: You can actually start collecting that right away. The API has two methods. AnalyzeComment, where you hand in the text and you get the score back, and SuggestCommentScore is the other method. That is, you can hand in a bit of text and say, “I would label this as toxic.” You can give that feedback now. A lot of communities do give their feedback, and then we can take that– it’s a manual process right now, but we can do that, and that’s what we can use to experiment with building these customized models.
As that feedback comes in, you can also help correct mistakes. It makes lots of mistakes, and when those failures go in, maybe it’s a type of language or a type of insult that’s never been seen before, or maybe it’s a false positive, something that the ML has associated with toxicity that it never should have. By sending those feedback loops, you can make the model better not just for yourself, but for anyone using those global models.
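A sketch of sending that feedback via the SuggestCommentScore method, assuming a comments:suggestscore endpoint that mirrors the analyze request and takes the score your community would assign; the exact field names here are a best guess, so verify them against the current API reference.

```python
import json
import urllib.request

API_KEY = "YOUR_API_KEY"  # placeholder
URL = ("https://commentanalyzer.googleapis.com/v1alpha1/comments:suggestscore"
       f"?key={API_KEY}")

def suggest_score(text: str, is_toxic: bool, community_id: str) -> None:
    """Send a community's own judgment back to Perspective, e.g. to report a false positive."""
    body = {
        "comment": {"text": text},
        # 1.0 means "we would label this toxic", 0.0 means "we would not".
        "attributeScores": {
            "TOXICITY": {"summaryScore": {"value": 1.0 if is_toxic else 0.0}}
        },
        "communityId": community_id,  # assumed field identifying your community
    }
    request = urllib.request.Request(
        URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(request).read()

# Example: correcting a false positive so the global models improve too.
suggest_score("I'm a proud gay father.", is_toxic=False, community_id="example-forum")
```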
That same method will be helpful in the future for being able to help people create custom models for their communities. One caveat there, though, is this notion that if you say what you would accept or reject, then it’s going to learn that. Another value that we have, and have had from the beginning with Perspective, is that we wanted this to be about how things are discussed, not what’s discussed. It’s this value of topic neutrality.
To that degree, it will be able to be customized to what your community defines as its guidelines, what its definition of toxicity is, because the only ingredients it’s mixing from are these subtypes of toxicity. There’s a set of ingredients you can make your own cocktail from, but the ingredients are things like threats and attacks, things that are common community guideline ingredients.
If you, say, moderated out every comment where someone said, “I don’t like the show,” or something like that, that’s not going to weigh into those different ingredients. There’s nothing in there for it. It’s not going to be able to pick up that moderation of a topic, for example, which we think is a feature of how we’ve designed the ML, from this values-first, principle-based development.
We really wanted to say, we want to build a set of tools that people can use to customize to their community guidelines, but we also don’t want to build something that anyone could customize to be able to score arbitrary checks. There’s lots of other tech that could do that, but Perspective is not about that. Perspective is about helping people protect voices in conversation, helping people have these impossibly difficult conversations online at scale.
[00:19:58] Patrick O’Keefe: I’ve banned political conversations in my communities for many years. General politics, which is really common to ban in communities that are not about politics, because you become consumed by them. What you are saying is that it’s not going to help me block out political topics, just the cadence with which those topics are discussed.
Cool. Okay.
You touched on this a little bit, but when I think about a new tool in a community, a tool, a concept, or an idea, and the impact it could have on the community space, I think about whether or not it would touch the vast majority of online communities. Again, we are talking about the small and no-budget folks as much as the big people. There are companies that offer applications that cost you five figures, six figures, the Lithiums of the world, but the things they do don’t really impact the space very much.
At least not for many years, if they come up with something. I’m talking about, you mentioned Discourse, that’s one. The phpBBs, the Discourses, even the cheaper commercial options, the XenForos, the Invision Communities. That sort of end of the spectrum. I know there’s a Discourse plugin, like you said. Is it part of your effort? Do you dedicate any resources to finding– “We want people to use Perspective. These are the applications that are the most dominant.”
If you were going to look at what most sites are right now, it’s a blog, you’d look at WordPress. It just might be a WordPress blog, I wouldn’t be surprised. That’s another application. Community software-wise, do you look for the most dominant applications and then dedicate a little bit of time and resources to try to bring this tool to that audience, or do you simply rely on the developers in that community to make it happen?
[00:21:27] Cj Adams: We’ve taken the approach of trying to make this as easy as we can for developers. That way, people can make what they want. One of the things that I’m conscious of is, we want to make this something that people can customize the way they want. I’d be cautious of saying, “Okay, well, I think this is the way that it makes sense to use Perspective on X platform.” Well, I’m not a user of that platform.
I want someone who uses that platform to say what the best way to use it is. We found that that kind of organic growth is where we’ve really seen success. Where people have been really happy is when the developer in that community, who knows that toolset really well, wants to build something that really fits the users of that platform. That’s just where we’ve seen success, and if we say, “Okay, this is the integration,” I feel like it’s less organic and ultimately less successful.
If people have a really passionate developer in one of these communities and they want a little bit of support, by all means, please reach out to us. You can go on our GitHub page and contact us directly. We would love to answer questions and help someone. If someone were really passionate about just Drupal and the comments on Drupal, and they reached out, we’d help them set up a cool plugin there. If you are in that position, or someone is in that position, please reach out on GitHub. We’d love to help out however we can, because we are excited about that. We don’t ourselves build those tools, because we want them to come from the community. We’ve found that they are always better when they do.
[00:22:48] Patrick O’Keefe: When I hear that, what I hear is that with Perspective API, it seems like you are building something that you see as sort of a platform, a platform for others to build on. If I’m reading that right, and it may not be, where does that platform concept go from here? Do you see these as the apps that work with Perspective? What is sort of the platform idea, if I’m reading it right, and how do you plan to scale it up?
[00:23:10] Cj Adams: It’s important to think about where Perspective came from, too. It came out of research, out of this research group called Conversation AI. That started in a team at Alphabet called Jigsaw. We sit in a funny corner of the Google and Alphabet world.
[00:23:25] Patrick O’Keefe: I’m always careful to say it’s Alphabet’s Jigsaw because [laughs] there’s like a hierarchy thing. I’ve heard like the stories of it, about how, for example, “This is Alphabet’s Jigsaw.” “Oh, that is Google’s parent, Alphabet’s Jigsaw.” It’s a funny thing to introduce.
[00:23:36] Cj Adams: [laughs] Some of these places get big and you get attracted to different pieces. Jigsaw is the team that reports into Alphabet, and it’s a really cool space that’s dedicated to understanding ways that technology can make people safer around the world. It works on issues like stopping DDoS attacks, and trying to help people have greater security from state-sponsored attacks on their Gmail accounts.
It also works on issues like trying to help people understand and stop phishing attacks, trying to look at what the emerging threats are and what is threatening voices around the world. One of the things we saw was that this threat of harassment and abuse in conversations was silencing voices. It was keeping people from being heard. That started this research effort. Perspective came out of that research effort that started in Jigsaw.
In a way, any API is a platform. It’s something that everyone is using. Perspective’s grown incredibly, and I think it has been really useful to a lot of people. Ultimately, we judge our impact based on how much we have improved the participation, quality, and empathy of conversations online. The reason those are our metrics is that we are trying to look at whether we are actually protecting voices in conversations.
With Perspective, we have the product, and there is a strategy there. Like I said, we want to work in more languages. We are going to be working on customization features and things like that. There is also the research part of the work. On the research side, we release as much of our data as we can. Any Creative Commons data source that we tag, we put it out there.
We release papers on our annotation schemes, all of the UX research we do. We just try and publish it, because we also think that there’s impact there. We don’t have any revenue mandate in Jigsaw to make money. We really have the mandate to say, “Prove that you have made people safer with technology. Prove that you’ve improved conversations. Prove that you’ve protected voices that otherwise wouldn’t have been heard.”
We are both pushing out these resources via the API, but we are also just giving as much as we can via the research so that anyone else can replicate the work. Whether they want to build their own version of Perspective or build their own model, we are very happy and excited when we see that kind of thing happen.
An example of that is, we ran a competition a while back where we gave out a bunch of annotated data. We had, I think, 4,500 different ML teams all compete on trying to build something similar to Perspective, so that anyone could be able to do this. We have another one coming up that we are looking forward to that’s going to also include findings from our newest research, and invite other people into that data and into that process, which we think is really important for building things like Perspective.
[00:26:11] Patrick O’Keefe: This tech, Perspective API, where do you think it could be applied that we are not talking about?
[00:26:17] Cj Adams: It’s a great question. I think I’m very excited about the newer and more creative applications of Perspective that I have seen. There’s a lot of utility in flagging things for human moderators. That’s machine-assisted human moderation and sorting for moderators. That’s been done a lot and it works, so more people should do it, and we should see it in more platforms and more languages.
One of the places that I’m excited about is just some of these really creative approaches that are happening. Some examples are– a couple of applicants we’ve seen have done things around community-driven moderation. Can you use things flagged by Perspective to then surface to community members and have them decide whether or not it does or doesn’t violate the guidelines? That way, the community themselves make that decision.
Maybe you say, okay, we have to get four votes before it gets moderated, but really, literally empowering not just the community moderator, but empowering every person, as far as the community gives them that ownership. I think that’s a really cool model that I’d like to see more of. Every time applicants apply for API keys, there’s just really exciting stuff. We had things that I would never have thought of. There was an applicant that was just someone trying to facilitate a community forum in an old folks’ home.
Apparently, it just got real mean sometimes, and they were looking for help with that. It’s like, “I never would have thought of that application.” Someone else was working on a tool to help people on the spectrum be able to learn how to communicate well, practice things, and understand when they might be saying something that would make someone leave the conversation. There are really creative applications of Perspective that I never would have thought of, and they are really exciting to see when they come in the door.
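A toy sketch of the community-vote idea mentioned above, with the four-vote threshold taken from the conversation as an illustrative number, not a rule from any particular platform.

```python
from collections import defaultdict

VOTES_NEEDED = 4  # illustrative threshold from the conversation

# comment_id -> set of member ids who voted that the comment violates the guidelines
votes: dict[str, set[str]] = defaultdict(set)

def record_vote(comment_id: str, member_id: str) -> bool:
    """Record one member's vote; return True once enough members agree the
    comment violates the guidelines and it should be hidden pending review."""
    votes[comment_id].add(member_id)
    return len(votes[comment_id]) >= VOTES_NEEDED

# Perspective might surface a borderline comment to members; the community decides.
for member in ("alice", "bob", "carol", "dan"):
    if record_vote("comment-123", member):
        print("Hide comment-123 and queue it for moderator review.")
```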
[00:27:54] Patrick O’Keefe: You told me before this show that, “People in the tech sector talk a lot about the ways that machine learning is smart and transforming, but sometimes we don’t talk enough about the ways it can be really stupid.” What does machine learning do when it’s really stupid?
[00:28:07] Cj Adams: There are three main types of machine learning: supervised learning, unsupervised learning, and then reinforcement learning. What we are talking about here is supervised learning, where you put in a bunch of training data, and the machine learning is trying to identify patterns in that data and then be able to, in this case, classify new examples to say what this does or doesn’t look like. One of the things that you have to recognize is that it’s going to be as smart or as dumb as what it learns from.
I think there is trust sometimes, this idea of, “Oh, the machine is doing this, so it’s right, it’s objective.” That couldn’t be further from the truth, in the sense that the machine is only as right as whatever the data was that it was trained on. When we started this, we did a bunch of analysis to try and understand whether there was bias in the raters that we had. Did men versus women rate toxicity differently, for example? Or did people in different geographic locations rate it differently?
We didn’t find major differences in that, which is one of the reasons we moved forward with the model and released it. But there was a huge blind spot, which was that we were only training on the conversations that already exist. That seems a bit odd, but there’s this huge negative space of all the people who are not currently speaking online, or are only speaking in closed, private communities.
By training on the data that we had, we realized that, particularly for frequently targeted groups, the names of those identity groups would often be associated with toxicity.
That’s a huge problem. Fundamentally, we realized we cannot build the ML for the conversations we want if we are only training on the conversations we already have.
Just as a small example, and we write about this on The False Positive blog, where we share the research as we go, take comments with the word gay, for example. There was a vanishingly small number of comments in our initial training datasets that used that term in a positive way. If you just look at the internet generally, at how people use that word, it’s used as an insult, it’s used to insult people and attack people. In a lot of forums, it’s not a very safe place to say, “I’m a proud gay father.”
There weren’t enough examples in our training set of using that term in a positive way. The ML is just looking for these patterns, and every time it saw that word, it was tied to some toxic insult or whatever, so it learned that association. We’ve gone through five or six iterations now of doing bias mitigation, and as we go, we’re publishing that research and publishing datasets, so that other people in text classification don’t make similar mistakes.
That’s just one example, and I think a very important one, of how ML will just learn the patterns that are in the data you have, and when you’re trying to create something different from what you have, it’s going to learn the problems that you already have in the world. It can make those mistakes, and you’ve got to figure out ways to use ML as a tool and really understand its weaknesses as a tool. You can read a lot more about what we’ve done to improve that and mitigate that on our blogs if you’re interested.
[00:31:05] Patrick O’Keefe: How do you find that blind spot? Is it people flagging it for you and bringing it up, saying, “Hey, I noticed that this term is scored as toxic”? That was one of my main exposure points with the aforementioned text box on the website, where this shows up weird. Is it other research that draws you in to specific groups and pushes you to reach out to communities dedicated to those groups, to import the discussions that they’ll share with you, to see what they see as toxic and non-toxic, to weight it? How do you figure out the biggest blind spots to tackle first?
[00:31:36] Cj Adams: All of these things. We have partnerships with advocacy groups like GLAAD, for example, using this example specifically to try and work on collecting examples of failures. We also have built custom experiences. A team at Google built something called Project Respect that you can check out, and it is something where you can type in a bunch of identity terms, and then it’ll take whichever– It was like, “I am a cook and a father and I’m gay.” If you said those three things, it would say, “Which one of these has the highest isolated toxicity score?” Then ask you to enter five or 10 different sentences using that in a loving way. You’re kind of collecting data that way. We also collect it directly from users. I think, like I said, as people are using it, if they disagree with the model, it can send that back to Perspective, and tell us when there’s that error.
In a way, the text box is kind of the hardest challenge, because most of the production use cases have a threshold of only calling something toxic when it’s above, say, 0.85 or 0.95, whereas we put up the text box knowing that people would be typing in anything and being really critical of whatever number it landed on.
We chose to do that really intentionally for two reasons, and there was a surprise benefit that happened. Like I said, we wanted to show new use cases, but we also really wanted to be committed to transparency, to showing exactly what this model is scoring. We wanted people to be able to see the ways it succeeded and the ways it failed. Separately, we also planned– We thought that a lot of people would throw a bunch of bad content into that, [laughs] and that happened in droves.
Shortly after we launched, everyone on 4chan said, “Oh, my goodness, let’s do what we did with the Tay Bot. Let’s train this the wrong way.” They all came in droves and said the most horrific things into that box, and tried to give it the wrong answers. We weren’t actually taking those answers and automatically training on them, we were sending those to humans. We just said, “Thank you for that drove of abuse that we can now use to improve.”
The third benefit that we didn’t really expect was fantastic, and I think came out of the transparency, which was, by having that there, every time someone typed in something and saw a result that they thought was wrong or bad or a mistake, they would blog about it or share it, and everyone that saw that would come and try and find a mistake by typing in what matters to them.
They would type in their own sentences, their own work, and through that transparency, it shows the ML’s faults. It’s public, in a way, but by having that transparency, everyone is able to type in and show those failures, and it motivates increased energy towards fixing them. When there were some press cycles around that, I thought, this is transparency working. This is what it’s about. It’s about showing the models directly as they are, letting people provide criticism, and that gives the information and energy to be able to improve them. I think that we’ve been able to do that and I’m really proud of it, and that box still will find new things, people will come and say, “Hey, you’re missing this, you’re missing that.” I think it’s really a fantastic way to learn: in-product feedback, the text box itself, as well as some of these dedicated partnerships and experiences we’ve built with other people to concretely, directly collect bias mitigation data.
[00:34:42] Patrick O’Keefe: Another thing you mentioned to me is that you see the need for technology companies to help address some of the more subtle forms of toxicity, in ways that new structures of discussions might be able to increase the participation, quality, and empathy, so that people aren’t silenced. Those are your words. I was curious about the structures of discussions part of that. When you say structures of discussions, what do you mean?
[00:35:05] Cj Adams: In a lot of forums, we have like an empty white box with a blinking cursor, and that’s how we talk to other humans. It’s like, “Is there anything we could do here that’s a little more creative, a structure that might facilitate less toxicity, that might facilitate people to understand the humanity of the people they’re talking to?” I don’t know what that answer is there. I am excited by trying to figure out how we might nudge people in a direction towards understanding people they disagree with, listening to people. Learning and sharing their views in a way that fosters understanding and fosters people being able to keep talking even when they disagree.
The reason I say the more subtle forms of toxicity is that there are the very toxic things– There are rules, you have community rules, and then there are actions that you can take on those rules. There’s kind of a clear, “Okay, if this is really toxic, what do we do?” There’s a clear response, but then there’s a long tail of stuff that’s not quite violating a rule, but maybe there’s a way that someone could say it that would encourage a better community and a better discussion.
I think that moderation is not an option there. Even stopping someone and saying, “Hey, you can’t do this,” isn’t an option. What are the different structures that allow someone to understand the impact that their language is having on the conversation, and nudge them in a direction that’s less toxic and more constructive for the community? I think that’s really interesting. One of the places I’ve seen that concretely, and I’m really a big fan, is Change My View on Reddit. Are you familiar with them?
[00:36:44] Patrick O’Keefe: Only vaguely.
[00:36:45] Cj Adams: Change My View was started by a man named Kal, who was just sort of looking at the fact that there are not many places someone can go to have their view changed, and actually try and learn from a position that’s not the same as their own. They set up just a subreddit. It’s grown to maybe half a million or so subscribers, but they have a very structured moderation and setup.
For example, there’s a rule that if you post and you don’t reply within three hours, the post is removed. One of the structures is just enforcing that people be there, and they want people to be there. They also built around a structure of rewarding and having people work towards what they call a delta, which is when someone’s mind changes in some way, just some point or something that you understood in a little bit more detail, did you learn something?
Those are pretty small structure changes, but I think they’re really interesting because they align incentives and actually change what rises to the top of the posting in a way that aligns with the community’s goals and values. I find the innovation in that space very exciting, and I think really important at this time in society, where a lot of people’s voices are silenced and a lot of divisiveness and polarization exists.
I’m not satisfied when the default is, “Okay, only talk to people you already agree with,” or, “Just deal with it. The internet’s always going to be awful and mean, just put up with it.” I don’t like either of those as the answer, [laughs] so I’m just interested in anyone who’s doing cool creative things in new ways that could change that dynamic.
[00:38:19] Patrick O’Keefe: Well, Cj it’s been great to have you on the show. Thank you for taking the time.
[00:38:23] Cj Adams: Yes, thanks so much for having me.
[00:38:24] Patrick O’Keefe: We’ve been talking with Cj Adams, product manager at Alphabet’s Jigsaw. For more on Perspective API, visit perspectiveapi.com and read their blog, The False Positive, at medium.com/the-false-positive. For the transcript from this episode, plus highlights and links that we mentioned, please visit communitysignal.com.
Community Signal is produced by Karn Broad, and Carol Benovic-Bradley is our editorial lead. Until next time.
Your Thoughts
If you have any thoughts on this episode that you’d like to share, please leave me a comment, send me an email or a tweet. If you enjoy the show, we would be so grateful if you spread the word and supported Community Signal on Patreon.