
Responsible ML Research @ Twitter

2021 / 22

As a User Researcher for Responsible ML at Twitter, I worked closely with the META (Machine Learning Ethics, Transparency, & Accountability) team. This section summarizes some of the projects I worked on. If you are interested in learning more about any of them, do not hesitate to reach out to me at if76 [at] cornell [dot] edu.

Algorithmic Bias Bounty for Collaborative Harm Identification

When building machine learning systems, it is nearly impossible to foresee all potential problems and ensure that a model will serve all groups of people equitably. Direct feedback from the communities affected by our algorithms helps us design products that serve everyone. Inspired by bug bounties in cybersecurity, we created the first-ever Algorithmic Bias Bounty competition, hosted at DEFCON AI Village 2021. We invited the ethical AI hacker community to take apart the saliency model Twitter used in production and identify bias and other potential harms within it. The results were very cool! Check out the winners here.

 

My role, alongside Kyra Yee, was to create the evaluation criteria used to judge participants’ submissions, inspired by existing frameworks for assessing risk in privacy and security. The challenge was coming up with a rubric concrete enough to grade and compare submissions, yet broad enough to encompass a wide variety of harms and methodologies. We wanted the rubric to leave room for participants to be creative with their submissions and the problems they perceived. To encourage work on issues that have historically received less attention in fair ML research, such as representational harms, we assigned different point values to different types of harm. We also encouraged qualitative analyses, grading each submission not only on its code but also on its assessment of why its approach and perspective were relevant.
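For a flavor of how such a rubric can be made concrete, here is a minimal sketch in Python; the harm categories, point values, and multipliers below are hypothetical placeholders for illustration, not the actual competition rubric.

# Illustrative sketch of a weighted scoring rubric (hypothetical values,
# not the actual competition rubric).

# Base points per harm type: representational harms weighted more heavily
# to encourage work on issues that receive less attention in fair ML research.
HARM_BASE_POINTS = {
    "representational": 30,  # hypothetical weight
    "allocative": 20,        # hypothetical weight
    "other": 10,             # hypothetical weight
}

def score_submission(harm_type, num_affected_groups, qualitative_quality, code_quality):
    """Combine harm type, reach, and the written justification into one score.

    qualitative_quality and code_quality are judge ratings on a 0-1 scale.
    """
    base = HARM_BASE_POINTS.get(harm_type, HARM_BASE_POINTS["other"])
    reach_multiplier = 1 + 0.1 * num_affected_groups
    # The qualitative analysis counts alongside the code, not as an afterthought.
    return base * reach_multiplier * (0.5 * qualitative_quality + 0.5 * code_quality)

# Example: a representational-harm submission affecting three groups.
print(score_submission("representational", 3, qualitative_quality=0.9, code_quality=0.8))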


Twitter Blog Post: Sharing learnings from the first algorithmic bias bounty challenge, by Kyra Yee and Irene Font Peradejordi, September 2021

Algorithmic Transparency to Calibrate Users’ Trust

ML systems are not perfect, and their limitations can be unclear to end users. Over-relying on a decision made by an ML system can cause severe harm in high-stakes situations. When building ML-driven products, it is critical to give users the tools they need to quickly assess how much the outcome of the system can and should be trusted. Algorithmic transparency, or explainability, can help, but it is not a one-size-fits-all solution for every user and context. For explanations to be effective, they must be embedded in a user-centered design process aimed at helping stakeholders calibrate their trust in ML systems.

 

This is why Yomna Elsayed and I created a human-centered framework to help product teams calibrate users’ trust in ML systems by collaborating with multiple stakeholders to identify (1) what to explain, (2) how to explain it, and (3) what is technically doable for each product.

 

To learn more about this project, reach out at if76 [at] cornell [dot] edu.

Data-Driven ML Personas

ML practitioners are a broad and diverse user base, and a user-centered framework is required to organize their needs and to help stakeholders make strategic decisions. Working alongside Twitter's directors, I led the development of data-driven user personas for the ML practitioner community, using a mixed-methods research approach. First, I identified the dimensions of highest variance within our population through semi-structured interviews. Then, I used the uncovered insights to craft a questionnaire designed to find clusters of users with similar behavior patterns. I used K-Means and PCA for the cluster analysis and conducted validation interviews. In total, I defined five interrelated clusters, as well as the relationships among them.
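As a rough illustration of the quantitative step, here is a minimal sketch of the cluster analysis, assuming the questionnaire responses are numeric and stored one row per practitioner; the file name and preprocessing are hypothetical simplifications.

# Minimal sketch of the persona cluster analysis (hypothetical data and file
# name; the real questionnaire and preprocessing were more involved).
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Numeric questionnaire responses, one row per ML practitioner.
responses = pd.read_csv("practitioner_survey.csv")  # hypothetical file

# Standardize so no single question dominates the distance metric.
scaled = StandardScaler().fit_transform(responses)

# PCA to reduce noise and keep the dimensions with the highest variance.
components = PCA(n_components=0.9).fit_transform(scaled)  # keep ~90% of variance

# K-Means with k=5, matching the five interrelated clusters described above.
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0).fit(components)
responses["cluster"] = kmeans.labels_

# A quick internal check before moving on to validation interviews.
print("silhouette:", silhouette_score(components, kmeans.labels_))
print(responses.groupby("cluster").mean())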
 

To learn more about this project, reach out at if76 [at] cornell [dot] edu.

Efficient, Reproducible, and Contextual Algorithmic Risk Assessment

Social media companies are tasked with identifying and mitigating the harms introduced by their algorithms, but there are currently no industry standards to rely on to structure this work. During my time at the company, I led the development of an auditing framework to identify and remediate unwanted bias in ML systems across the company. I drew on risk assessment methodologies from cybersecurity to balance the need for efficient, reproducible, and contextual ethical risk assessments of ML models in an organizational setting.
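As an illustration of the cybersecurity-style structure, here is a minimal sketch of a likelihood-times-severity risk score; the scales, thresholds, and priority bands are hypothetical placeholders rather than the framework we actually used.

# Hypothetical sketch of a likelihood x severity risk matrix, in the spirit of
# cybersecurity risk assessment; scales and thresholds are placeholders.
LIKELIHOOD = {"rare": 1, "possible": 2, "likely": 3, "almost_certain": 4}
SEVERITY = {"negligible": 1, "moderate": 2, "major": 3, "critical": 4}

def risk_score(likelihood, severity):
    """Return a numeric risk score and a coarse priority band."""
    score = LIKELIHOOD[likelihood] * SEVERITY[severity]
    if score >= 12:
        band = "remediate before launch"
    elif score >= 6:
        band = "remediate on a committed timeline"
    else:
        band = "document and monitor"
    return score, band

# Example: a likely, major harm would be prioritized for remediation.
print(risk_score("likely", "major"))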

 

To learn more about this project, reach out at if76 [at] cornell [dot] edu.

ACM FAccT 2022 Workshop – Imagining a Collaborative Approach between Companies and Communities on Developing Harm Remediation Strategies

I led the development of a CRAFT workshop at the FAccT conference that brought together policy experts, activists, academic and independent researchers, and industry practitioners in two panels to help answer two big questions: (1) how might companies collaborate with communities to develop remediation strategies for surfaced algorithmic harms? and (2) how might companies be held accountable for implementing those remediation strategies?


ACM FAccT 2022 CRAFT – Imagining a Collaborative Approach between Companies and Communities on Developing Harm Remediation Strategies, 2022
