Part 1: What we learned about AppSec programs from the Twitch code leak

Published October 7 2021 · 3 min. read

On Wednesday, Oct. 6, 2021, an anonymous 4chan user posted what they claimed was 125 GB of data from 6,000 internal Git repositories. Twitch confirmed the massive data leak, which included source code and creator earnings, and stated that the breach was due to a “server configuration change”.

While there will be many negative repercussions of this breach, it does provide us with a trove of raw data that we can use to better understand the SSDLC of a typical, security-aware company and see if other organizations can learn from this failure.

Specifically, using our Code Risk Platform, we analyzed the source code and other data that is now freely accessible on the Internet, and we were able to gain critical insights into the Twitch application security program. Our findings show that Application Security is hard and that code risk is multidimensional.

Key examples

AWS secrets in code

The Twitch Entitlement Service is written in Go, the language of choice for many of the packages in the leak. The package is rather extensive and appears to be part of crucial operations, yet it is not free of hardcoded AWS secrets. We have censored the actual keys because the leak is still fresh.

import (
    "time"
    "github.com/aws/aws-sdk-go/aws/credentials"
    "code.justin.tv/commerce/AmazonMWSGoClient/mws"
)

const (
    awsID = "AKIAJQT6A3xxxxxxxx"
    awsSecret = "HwlMcT0u/s8GWBRA4J95WgP3xxxxxxxxxxxxxxxx"
    awsToken = ""
    hostSpecific = false
    marketplace = "us-west-2"
    serviceName = "TwitchEntitlementService"
    client = ""
    environment = "TwitchEntitlementService/NA"
    hostname = "Visage"
    partitionID = ""
    customerID = ""
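
Hardcoded key IDs like the one above are among the easier secrets to detect, because AWS access key IDs for long-term credentials follow a recognizable shape. The following is a minimal sketch of such a check, written in Python for illustration; it is not the scanner used for this analysis, and `find_aws_key_ids` is a hypothetical helper name.

```python
import re

# Access key IDs for long-term IAM credentials start with "AKIA", followed by
# 16 uppercase letters or digits.
AWS_ACCESS_KEY_ID = re.compile(r"\b(AKIA[0-9A-Z]{16})\b")


def find_aws_key_ids(source: str) -> list[tuple[int, str]]:
    """Return (line_number, key_id) pairs for every candidate key ID."""
    hits = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        for match in AWS_ACCESS_KEY_ID.finditer(line):
            hits.append((line_number, match.group(1)))
    return hits


# AKIAIOSFODNN7EXAMPLE is the placeholder key ID used in AWS documentation.
sample = 'const (\n    awsID = "AKIAIOSFODNN7EXAMPLE"\n    awsToken = ""\n)'
print(find_aws_key_ids(sample))  # [(2, 'AKIAIOSFODNN7EXAMPLE')]
```

The censored key shown in this article would not match, since the masking characters are lowercase. A pattern match like this only says that something looks like a key; it says nothing about whether the key is live or how important the surrounding code is.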

Twitch secrets in a dedicated secrets file

Test files are often overlooked because they are handled with less care and less rigorous security practices. In addition, threat intelligence indicated that these files are being investigated by different actors.

// TWTestTwitchKitSecrets.m
// Pods
// Created by Borders, Heath on 10/3/16.

#import "TWTestTwitchKitSecrets.h"

NSString * _Nonnull const TWTestTwitchKitClientID = @"85lcqzxpb9bqu9z6ga1ol55du";
NSString * _Nonnull const TWTestTwitchKitClientID2 = @"p9lhq6azjkdl72hs5xnt3amqu7vv8k2";
NSString * _Nonnull const TWTestTwitchKitClientSecret = @"sc0qn64ihq2m3f47zl2x626jdd1sm6x";
NSString * _Nonnull const TWTestTwitchKitClientSecret2 = @"jwdzcotxzyftca5w8m1w9ib8jpo6lto";
NSString * _Nonnull const TWTestTwitchKitClientIdentifer = @"126250792";

// iostest:test$1234
NSString * _Nonnull const TWTestTwitchKitRefreshToken = @"zjjifl3utnrdl9nty2rl1u1r27f7cx4qahtbxnduls53crma78";

// ios_1088:test$1234
NSString * _Nonnull const TWTestTwitchKitBitsUserRefreshToken = @"eyJfaWQiOiIiLCJfdXVpZCI6ImM5NmYzYzBiLTQxOWItNDI5NS1hNzg5LTJkOTYzODQyYjVmYyJ9%SNk/aF6wrjCKLfkAubPwAM3yMsTX/Nl2Hi61sXYFNjQ=";

Private Keys

A package that dates back to Justin.tv, the predecessor brand of Twitch from before 2014, contains a private key. We cannot tell whether it has since been revoked, but in any case private keys should never be included in code repositories in any form.

-----BEGIN RSA PRIVATE KEY-----

MIIEpAIBAAKCAQEA8emj4kzS0BP7U9ixEX8vxe9Df5QzXTIpc+289EH9VJbcR4QJ

q1FTlA7MLE3WKXSMCfn5wU3fMJeCz1S8u+qcOhbpohW1KxRD937C/YtZK7EUhzyF

1EGNDH0w+3tec5gU/wWGx565WlJvNwxeFkUYb6lGbXWVNRBWxdSRWVJP2aRHKT3N

C1gENCoYFoPN051moQviIsLlCCFR9SLJtFXc6NDBMbndzFjLMRRMfGca7bTuuNnl

rGdqb7TJp7ETgD2wiWSP62hsv83LgpIb23JiV24H9le/SIF75Y7T57A4NsdhMlL9

oMO2WwmYNvgePEqsyd/U7TPxkR6CvGMiXxrn2wIDAQABAoIBAHLjyI6Qd8qUwtcm

Yanyoqi5om/z3ZUUXrWNIiFLOdozr7hTUBhKDoyRnowoB182189hJimVJzu3qUt4

bg49NSctfJYbAyjLfiAL1uV9icMDXcGAj/qniyp0RpAZHll9z/LyF/m0O0lXPzSA

riqbdCiL10PjBRLniJ55/vHR8tRkkXok+hD2wt6H6XlJkPmSiro5tCbMkpHffWdt

vy6fAlrb2TI15XA/J9wg4IGhyb/L6GRg5baL4BX1tw08j4qSIzDXOWMWAnNh0v6i

OmpnApOaDC7l6NNgxyFLM8KaV00ej5HwXfMOEUz0mPmAva1KhUotNaklD4uW2CQp

uuD2w/ECgYEA+X97SbEFlwXjvYbwzTa4KEn45zXrC9jb9L0AEc0CWsc49UxyTE0l

FopWD/1wqJSkX6jmk5EPXt2+tjicOuUn2HSFB74myA4yNJLHpzHXbDYSUqWtFd/e

Zpj8RbHC7Cq6NUAaqKZaaq1KyVBNUbx9KfJ3NCc9eRUJJp5kVpLSHVMCgYEA+DeN

qH0dXnjklZbcdPHYod0DWYpPpjP7u4jtdQYgezLfLg8vRl6V2j65l52MdR1pjPqj

KxeruFobikzxbF1zWMCF/mpW6JQxeiEbAo2lN9NlwsebLTGmDFdzh97wh71XHbaG

NMppINGMlmlfFDQhz8FFeeQqudnRxRM2qMVJslkCgYEA7K//aZ1BzE+OCVJmRofO

lIn4Un9YB9kmcTqLQlfWEABHDI4FMFVPBd8eXfT0VzkL5qP4ea13g2uhbISv0T9r

WXDQctP1PnwZLL7CIN6rmsCBCV6aoNHLzlD7obJNVHYESFgT8kI+LE1RUUGY2B2U

L6MRaqx/KMrH75b7YRXPtnkCgYEAoq92jzoBp8vAtjK8p4FjpSNAcM1wStTDZzTl

vc+YNmcvU/br20lfGj4GUlMWniP67EXR8AqBqECW0FyB166gTUlSCWAVOjb2/r73

/wJriV1q0vEUydhCptAijqkWKUF1+amJ6MvJf5MYe/TwNkO87XgVW0CqqEkVbf+b

0Z4NIXECgYBfe99o+/TgLqcLXFfT/cL2coCmwhXmsoechDUJhhorJHjCnoUBH3Dr

4myZEkJNsiwgPJyhn2d9zwqlWyfpPXPPBVtOiPZPxeOmO9KzagLLAspvzjX9r/JB

NddFcdfSrPTLmFOyTm9WTKURP41H4DuCaoBFahNouKfiehoD36LTYg==

-----END RSA PRIVATE KEY-----

Google API hardcoded key

This key sits in what is no doubt one of the most important packages for video content. It is also the largest package in size and has many contributors. The package includes many apparent secrets in code. Any code scanner will surface them, but without context the findings easily turn into alert fatigue: they can no longer be prioritized, and slips like this one happen.

namespace {

const uint32_t kIdPrefixOffset = 3;
const std::string kURLPrefix = "https://www.googleapis.com/youtube/v3/videos";
const std::string kURLPagePrefix = "www.youtube.com/watch?v=";
const std::string kParamId = "?id=";
const std::string kURLSuffix = "&part=snippet%2CcontentDetails%2Cstatistics"
    "&key=AIzaSyCGyZFpJnjHl8Bj1fTgcyq5hBUp-0wASRo";

The key is concatenated into the request URL by the ConstructURL function later in the same file:

std::string ConstructURL(const std::string& video_id) {
    std::string url = kURLPrefix + kParamId + video_id.substr(kIdPrefixOffset)
        + kURLSuffix;
    return url;
}

ConstructURL, in turn, is called by a video-download function:

bool DownloaderYoutubeVideoInfo::YTVideoInfoImpl::Download(
    const std::string& video_id) {
    std::string url = ConstructURL(video_id);
    if (!curl::HTTPDownloaderBasic::Download(url)) {
        TLOG(ERROR) << "Download failed for " << url;
        return false;
    }
    info_creator_->set_video_id(video_id);
    if (!ParseInternal()) {
        TLOG(ERROR) << "Parsing failed for " << url;
        return false;
    }
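
One side effect is visible in the snippet above: the constructed URL, key parameter included, is what gets written to the error log on failure. As an illustration only, here is a minimal Python sketch of masking sensitive query parameters before a URL is logged. `redact_url` and `SENSITIVE_PARAMS` are hypothetical names, and the key in the sample URL is a made-up placeholder.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical set of query parameter names whose values should never be logged.
SENSITIVE_PARAMS = {"key"}


def redact_url(url: str) -> str:
    """Return the URL with the values of sensitive query parameters masked."""
    parts = urlsplit(url)
    query = [
        (name, "REDACTED" if name in SENSITIVE_PARAMS else value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))


url = "https://www.googleapis.com/youtube/v3/videos?id=abc123&part=snippet&key=not-a-real-key"
print(redact_url(url))
# https://www.googleapis.com/youtube/v3/videos?id=abc123&part=snippet&key=REDACTED
```

Masking the log line does not fix the hardcoded key itself; it only keeps the secret from spreading to a second place.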

Facebook keys, database passwords on django/python

Developer awareness is the backbone of any application security program. We saw some cases in which this awareness was consciously bypassed, such as this one:

# Make this unique, and don't share it with anybody.
SECRET_KEY = 'r-+!3_et@czju98^v=hprrrzqzibo!4w4&dy9p^9d3li49t=$9'
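
A common way to follow the comment's own advice is to keep the value out of the repository and read it from the environment at startup. The sketch below assumes an environment variable named `DJANGO_SECRET_KEY` and a helper named `load_secret_key`; both are hypothetical names chosen for illustration and are not taken from the leaked code.

```python
import os


def load_secret_key(env_var: str = "DJANGO_SECRET_KEY") -> str:
    """Read the secret key from the environment and fail fast if it is absent."""
    value = os.environ.get(env_var, "")
    if not value:
        raise RuntimeError(f"{env_var} is not set")
    return value


# In a settings module this would be used as:
#     SECRET_KEY = load_secret_key()
```

Failing at startup when the variable is missing is a deliberate choice in this sketch: a silent fallback to a default value would reintroduce a shared, guessable secret.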

Terraform includes DB password

Terraform .tf files, by definition, contain a declarative representation of the infrastructure to be provisioned, which makes for better cloud operations. In several cases we saw these mechanisms misused to embed hard-coded secrets.

resource "aws_db_instance" "db" {
    identifier = "${var.project_name}-${var.environment}"
    auto_minor_version_upgrade = "false"
    engine = "postgres"
    multi_az = "true"
    instance_class = "db.t2.large"
    allocated_storage = 10
    apply_immediately = "false"
    backup_retention_period = 30
    name = "metabase"
    username = "metabase"
    password = "pBKr2pkTVva"
    publicly_accessible = "false"
    db_subnet_group_name = "${aws_db_subnet_group.db_subnet_group.name}"
    vpc_security_group_ids = ["${data.terraform_remote_state.remote_state.twitch_subnets_sg}"]
}
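
In the resource above, most values are references such as `${var.environment}`, while the password is a string literal. That difference can be checked mechanically. The following is a rough Python sketch, assuming the quoted-interpolation style used in this file. `hardcoded_passwords` is a hypothetical helper, and `var.db_password` is a made-up variable name.

```python
import re

# Matches lines of the form `password = "..."` and captures the quoted value.
PASSWORD_ASSIGNMENT = re.compile(r'^\s*password\s*=\s*"([^"]*)"', re.MULTILINE)


def hardcoded_passwords(tf_source: str) -> list[str]:
    """Return password values that are literals rather than ${...} references."""
    values = PASSWORD_ASSIGNMENT.findall(tf_source)
    return [value for value in values if value and "${" not in value]


literal = 'resource "aws_db_instance" "db" {\n  password = "not-a-real-password"\n}'
reference = 'resource "aws_db_instance" "db" {\n  password = "${var.db_password}"\n}'
print(hardcoded_passwords(literal))    # ['not-a-real-password']
print(hardcoded_passwords(reference))  # []
```

A real check would parse the configuration rather than match lines, but even this crude rule separates the literal password from the references around it.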

Network Device BGP block Password

Network device configuration repositories are also part of the leaked trove. Because these configuration formats have been pretty much standard for the last 20+ years, some of their faults are there for legacy reasons. In this case, ‘password 7’ indicates that the string that follows is the encoded, rather than plain-text, form of the BGP neighbor password (the key used for TCP MD5 authentication of the session) on the Arista platform that the file targets.

The block below configures BGP neighbors; many files contain the same malpractice.

router bgp 64516
   distance bgp 20 200 20
   graceful-restart stalepath-time 30
   maximum-paths 32 ecmp 32
   neighbor MA peer-group
   neighbor MA remote-as 64516
   neighbor MA fall-over bfd
   neighbor MA allowas-in 3
   neighbor MA password 7 fB4XWf3mJFQdgqwjstGhoQ==
   neighbor MA send-community
   neighbor MA maximum-routes 12000
   neighbor MCS peer-group
   neighbor MCS remote-as 64514
   neighbor MCS fall-over bfd
   neighbor MCS allowas-in 3
   neighbor MCS route-map mcs-out out
   neighbor MCS send-community
   neighbor MCS maximum-routes 12000
   neighbor MDS peer-group
   neighbor MDS remote-as 64515
   neighbor MDS fall-over bfd
   neighbor MDS allowas-in 3
   neighbor MDS route-map mds-out out
   neighbor MDS password 7 k2aRtzsXB1d9lmz/Tv5FyQ==
   neighbor MDS send-community
   neighbor MDS maximum-routes 12000
   neighbor 10.36.98.238 peer-group MA
   neighbor 10.36.98.238 description INTERNAL-MA:r708-ma01.pdx05:1:64516::
   neighbor 10.36.113.33 peer-group MDS
   neighbor 10.36.113.33 description INTERNAL-MDS:r700-mds02.pdx05:1:64515::
   redistribute connected route-map connected-to-bgp
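
Because device configurations like this are plain text with a regular line structure, they can be linted like any other code. The following is a small Python sketch that lists the neighbors whose password is written directly into the configuration. `neighbors_with_inline_password` is a hypothetical helper, and the password value in the sample is a placeholder.

```python
def neighbors_with_inline_password(config: str) -> list[str]:
    """Return the neighbor names whose password is written into the config."""
    flagged = []
    for line in config.splitlines():
        tokens = line.split()
        # Expected shape: neighbor <name> password [<type>] <string>
        if len(tokens) >= 4 and tokens[0] == "neighbor" and tokens[2] == "password":
            flagged.append(tokens[1])
    return flagged


sample = """router bgp 64516
   neighbor MA peer-group
   neighbor MA password 7 EXAMPLE-NOT-A-REAL-VALUE
   neighbor MCS peer-group
"""
print(neighbors_with_inline_password(sample))  # ['MA']
```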

Closing Thoughts

Our Code Risk Platform analysis automatically identified multiple key insights:

Unnecessary False Positives. Many findings from vulnerability scanning tools such as SAST and SCA turn out to be false positives, or lower in risk, once context is considered (e.g., tests, very old code, example snippets).
Number of Incidents Rises with Activity. Once a repository becomes very active, a growing number of incidents becomes all but inevitable. Organizations need to apply context to these incidents in order to prioritize them and gain proper visibility into risk (whether lower or higher).
Risk is Often Underestimated. Many true positives deserve a higher risk rating when considered in context (a relational connection to a core infrastructure platform, or code related to authentication and authorization mechanisms).
Code drift and legacy code are often a problem. There is significant usage of code that was written a long time ago (some indications point to 2014, and some even suggest code from 2011) that is still being used in crucial parts of the software (video, stream content management). No one bothers to look at it and prioritize existing security issues.
Everything should be treated as code. Organizations need to evaluate configurations, such as BGP settings for network devices, as code. This enables risk detection before those settings are put into production.
Secrets in Code Detection is missing context. Secret exposure is hard to tackle without proper visibility into the code and its history. Finding so many secrets in the Twitch code that a simple secrets scanning tool would detect indicates that the AppSec team was unable to see and properly prioritize these secrets.
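
To make the idea of context concrete, here is a toy Python sketch in which the same kind of finding receives a different priority depending on where it lives. It is not how our platform computes risk. The `Finding` type, the marker lists, the 1-10 scale, and the file paths are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    path: str
    kind: str            # e.g. "aws-key", "db-password"
    base_score: int = 5  # hypothetical 1-10 scale


# Crude substring markers; a real system would use far richer context.
LOWER_RISK_MARKERS = ("test", "example", "sample")
HIGHER_RISK_MARKERS = ("auth", "infra", "terraform")


def contextual_score(finding: Finding) -> int:
    """Adjust the base score using the location of the finding."""
    score = finding.base_score
    path = finding.path.lower()
    if any(marker in path for marker in LOWER_RISK_MARKERS):
        score -= 2  # tests and examples are less likely to hold live secrets
    if any(marker in path for marker in HIGHER_RISK_MARKERS):
        score += 3  # core infrastructure and authentication code matters more
    return max(1, min(10, score))


findings = [
    Finding("pods/TWTestTwitchKitSecrets.m", "client-secret"),
    Finding("terraform/db.tf", "db-password"),
    Finding("src/video/downloader.cc", "api-key"),
]
for finding in sorted(findings, key=contextual_score, reverse=True):
    print(contextual_score(finding), finding.path)
```

With these made-up inputs, the Terraform finding sorts first and the test file last, even though a plain secrets scanner would report all three at the same level.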

Twitch is a large, security-aware organization, yet finding secrets in code and other security risks there is not in the least bit surprising. Application Security is HARD. It involves complex processes, tools, and highly technical skill sets. What we can learn is that organizations need to build an Application Security program. Identifying vulnerabilities and looking for security issues in silos isn’t enough. It’s essential to understand the history of your code bases and identify risks in the context of the entire application and its infrastructure.
