I’ve spent the better part of today playing with #Twitter’s internal API and trawling through GitHub for examples of apps that use it. I think I’ve learned enough to try to code something up. This thread will be my notebook and journal.
The minimal objective is to write an exporter from my following list to OPML, so that I can move most of my feed into an RSS reader. And if I don’t lose interest by then, I may even try to write a no-frills web app for reading the feed.
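For reference, the OPML side of this is small. A minimal sketch using only the standard library; the function name `build_opml` and the feed URLs are placeholders of mine (how an account maps to an RSS feed URL is a separate problem not covered here):

```python
import xml.etree.ElementTree as ET


def build_opml(title, feeds):
    """Build an OPML 2.0 document from (name, feed_url) pairs."""
    opml = ET.Element("opml", version="2.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for name, feed_url in feeds:
        # One <outline> per subscription; RSS readers look at xmlUrl.
        ET.SubElement(body, "outline", type="rss", text=name, xmlUrl=feed_url)
    return ET.tostring(opml, encoding="unicode")


# Hypothetical feed URL, for illustration only.
print(build_opml("Following", [("example_user", "https://feeds.example/example_user.rss")]))
```

Most RSS readers can import a file like this directly.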
All #twitter API calls require some sort of authentication.
Interestingly, the authorization header seems to be constant. I’m guessing this identifies the request as coming from Twitter’s own web UI:
authorization: Bearer AAAAAAAAAAAAAAAAAAAAANRILgAAAAAAnNwIzUejRCOuH5E6I8xnZz4puTs%3D1Zv7ttfk8LF81IUq16cHjhLTvJu4FA33AGWWjCpTnA
The actual authentication happens elsewhere. For guest sessions, you need an `x-guest-token: 1649859312251027458` header, where the token is obtained by a separate call. Only a subset of calls is available in this mode. For the rest, you need a cookie from a logged-in user:
cookie: auth_token=1234567890abcdef58dc6829393d4604b9e37c8a; ct0=1234567890abcdef0b09e38a20dcdd5cb6ec4cf8f2ba357187cda008b0f39273308a6b7ef6d318f609bc83563709c247e51daad090a116d775ef1fa55074cf5c235893a45f99d1cc49ac4fe61fec238d;
x-csrf-token: 1234567890abcdef0b09e38a20dcdd5cb6ec4cf8f2ba357187cda008b0f39273308a6b7ef6d318f609bc83563709c247e51daad090a116d775ef1fa55074cf5c235893a45f99d1cc49ac4fe61fec238d
Note that the `x-csrf-token` is the same as the `ct0` cookie.
Both are obtained through a somewhat involved login workflow shown in https://github.com/trevorhobenshield/twitter-api-client/blob/main/twitter/login.py
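To keep the two modes straight, here is a small sketch of the header sets described above. The function names are mine, and the bearer string, guest token, `auth_token` and `ct0` are passed in rather than obtained here:

```python
def guest_headers(bearer, guest_token):
    """Headers for a guest session: constant bearer plus the guest token."""
    return {
        "authorization": bearer,
        "x-guest-token": guest_token,
    }


def user_headers(bearer, auth_token, ct0):
    """Headers for a logged-in session: bearer, session cookie and CSRF token."""
    return {
        "authorization": bearer,
        "cookie": f"auth_token={auth_token}; ct0={ct0};",
        # The CSRF header simply repeats the value of the ct0 cookie.
        "x-csrf-token": ct0,
    }
```

Either dictionary can be passed as the request headers by whatever HTTP client ends up being used.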
As for the API calls themselves, https://github.com/fa0311/TwitterInternalAPIDocument and https://github.com/fa0311/twitter-openapi seem to be the closest there is to documentation. I think this should be fine for my purposes.
The two calls I need are /sLVLhk0bGj3MVFEKTdax1w/UserByScreenName and /IWP6Zt14sARO29lJT35bBw/Following. Unfortunately, the second one is not available with a guest token, so I’ll have to deal with the login workflow 😒
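A sketch of how I expect to build the URLs for these calls. The query IDs and the `/i/api/graphql` root are the ones seen in this thread; the `variables` query parameter carrying a JSON object, and the `screen_name` field inside it, are assumptions based on the reference projects linked above and may not be complete:

```python
import json
from urllib.parse import urlencode

API_ROOT = "https://twitter.com/i/api/graphql"


def graphql_url(query_id, operation, variables):
    """Build a call URL with the variables JSON-encoded into the query string."""
    query = urlencode({"variables": json.dumps(variables, separators=(",", ":"))})
    return f"{API_ROOT}/{query_id}/{operation}?{query}"


# "example_user" is a placeholder screen name.
print(graphql_url("sLVLhk0bGj3MVFEKTdax1w", "UserByScreenName", {"screen_name": "example_user"}))
```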
Back at it again...
Twitter's login flow is weird. For one, rather than just sending login/password and getting the cookie, it is actually a series of API calls. Also, the purpose of some of the steps is not apparent, like LoginJsInstrumentationSubtask or AccountDuplicationCheck. No idea what those mean, but I can cargo-cult them.
I think I am beginning to understand the logic behind this API. Apparently, the flow can be a lot more varied and complicated than you would think. Here, at the beginning of the process, the client sends a list of all the subtasks it can handle:
{
"input_flow_data": {
"flow_context": {
"debug_overrides": {},
"start_location": {
"location": "unknown"
}
}
},
"subtask_versions": {
"action_list": 2,
"alert_dialog": 1,
"app_download_cta": 1,
"check_logged_in_account": 1,
"choice_selection": 3,
"contacts_live_sync_permission_prompt": 0,
"cta": 7,
"email_verification": 2,
"end_flow": 1,
"enter_date": 1,
"enter_email": 2,
"enter_password": 5,
"enter_phone": 2,
"enter_recaptcha": 1,
"enter_text": 5,
"enter_username": 2,
"generic_urt": 3,
"in_app_notification": 1,
"interest_picker": 3,
"js_instrumentation": 1,
"menu_dialog": 1,
"notifications_permission_prompt": 2,
"open_account": 2,
"open_home_timeline": 1,
"open_link": 1,
"phone_verification": 4,
"privacy_options": 1,
"security_key": 3,
"select_avatar": 4,
"select_banner": 2,
"settings_list": 7,
"show_code": 1,
"sign_up": 2,
"sign_up_review": 4,
"tweet_selection_urt": 1,
"update_users": 1,
"upload_media": 1,
"user_recommendations_list": 4,
"user_recommendations_urt": 1,
"wait_spinner": 3,
"web_modal": 1
}
}
And the server sort of instructs the client which actions to offer to the user:
{
"flow_token": "g;168461104176909845:-1684611164737:Mh2XA15kcSPOvXshdM51j6Ea:1",
"status": "success",
"subtasks": [
{
"subtask_id": "LoginEnterUserIdentifierSSO",
"settings_list": {
"settings": [
{
"value_type": "button",
"value_identifier": "google_sso_button",
"value_data": {
"button": {
"navigation_link": {
"link_type": "subtask",
"link_id": "google_sso",
"label": "Continue with Google",
"subtask_id": "EnterIdGoogleSSOSubtask"
},
"style": "brand",
"icon": {
"icon": "logo_google_g_color"
},
"preferred_size": "normal"
}
}
},
{
"value_type": "button",
"value_identifier": "apple_sso_button",
"value_data": {
"button": {
"navigation_link": {
"link_type": "subtask",
"link_id": "apple_id",
"label": "Continue with Apple",
"subtask_id": "EnterIdAppleSSOSubtask"
},
"style": "brand",
"icon": {
"icon": "logo_apple"
},
"preferred_size": "normal"
}
}
},
{
"value_type": "separator",
"value_identifier": "separator",
"value_data": {
"separator": {
"label": {
"text": "or",
"entities": []
}
}
}
},
{
"value_type": "text_field",
"value_identifier": "user_identifier",
"value_data": {
"text_field": {
"content_type": "text",
"hint_text": "Phone, email, or username"
}
}
},
{
"value_type": "button",
"value_identifier": "next_button",
"value_data": {
"button": {
"navigation_link": {
"link_type": "task",
"link_id": "next_link",
"label": "Next"
},
"style": "primary",
"preferred_size": "normal"
}
}
},
{
"value_type": "button",
"value_identifier": "forgot_password",
"value_data": {
"button": {
"navigation_link": {
"link_type": "subtask",
"link_id": "forget_password",
"label": "Forgot password?",
"subtask_id": "RedirectToPasswordReset"
},
"style": "secondary",
"preferred_size": "normal"
}
}
}
],
"detail_text": {
"text": "Don't have an account? Sign up",
"entities": [
{
"from_index": 23,
"to_index": 30,
"navigation_link": {
"link_type": "deep_link_and_abort",
"link_id": "signup_deep_link",
"url": "https://twitter.com/i/flow/signup"
}
}
]
},
"style": "step",
"header": {
"primary_text": {
"text": "Sign in to Twitter",
"entities": []
}
},
"navigation_style": "hide",
"horizontal_style": "compact"
},
"subtask_back_navigation": "cancel_flow"
},
{
"subtask_id": "EnterIdGoogleSSOSubtask",
"single_sign_on": {
"provider": "google",
"scopes": [
"openid",
"email",
"profile"
],
"state": "j28nFz5x2qeOetxXP7RpW4hldQFpYIKWoEkFqBPDJqh",
"next_link": {
"link_type": "task",
"link_id": "next_link"
},
"fail_link": {
"link_type": "subtask",
"link_id": "fail_link",
"subtask_id": "LoginEnterUserIdentifierSSO"
},
"cancel_link": {
"link_type": "subtask",
"link_id": "cancel_link",
"subtask_id": "LoginEnterUserIdentifierSSO"
}
},
"subtask_back_navigation": "cancel_flow"
},
{
"subtask_id": "EnterIdAppleSSOSubtask",
"single_sign_on": {
"provider": "apple",
"scopes": [
"email",
"name"
],
"state": "TPt3CJRXQfJaN3tjB3QPEi_FS_WtsOlj68qfoeTGmx4",
"next_link": {
"link_type": "task",
"link_id": "next_link"
},
"fail_link": {
"link_type": "subtask",
"link_id": "fail_link",
"subtask_id": "LoginEnterUserIdentifierSSO"
},
"cancel_link": {
"link_type": "subtask",
"link_id": "cancel_link",
"subtask_id": "LoginEnterUserIdentifierSSO"
}
},
"subtask_back_navigation": "cancel_flow"
},
{
"subtask_id": "RedirectToPasswordReset",
"open_link": {
"link": {
"link_type": "deep_link_and_abort",
"link_id": "password_reset_deep_link",
"url": "https://twitter.com/i/flow/password_reset?input_flow_data=%7B%22requested_variant%22%3A%22eyJwbGF0Zm9ybSI6IlJ3ZWIifQ%3D%3D%22%7D"
}
}
}
]
}
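To make sense of responses like the one above, a small helper that lists which subtasks the server sent back is handy. This is only a sketch: it relies on nothing beyond the `flow_token`, `subtasks`, `subtask_id` and `subtask_back_navigation` keys visible in the dump, and treats every other key of a subtask as its "kind". The name `summarize_flow` is mine.

```python
def summarize_flow(response):
    """Return the flow token and a list of (subtask_id, kinds) pairs."""
    subtasks = []
    for subtask in response.get("subtasks", []):
        # Everything except the id and the back-navigation hint describes
        # what the client is supposed to render or do for this subtask.
        kinds = [
            key for key in subtask
            if key not in ("subtask_id", "subtask_back_navigation")
        ]
        subtasks.append((subtask["subtask_id"], kinds))
    return response["flow_token"], subtasks


# Heavily trimmed version of the response above.
sample = {
    "flow_token": "g;168461104176909845:-1684611164737:Mh2XA15kcSPOvXshdM51j6Ea:1",
    "status": "success",
    "subtasks": [
        {
            "subtask_id": "LoginEnterUserIdentifierSSO",
            "settings_list": {},
            "subtask_back_navigation": "cancel_flow",
        },
        {"subtask_id": "RedirectToPasswordReset", "open_link": {}},
    ],
}
token, subtasks = summarize_flow(sample)
print(token)
print(subtasks)
```

Some of the resulting kinds (`settings_list`, `open_link`) match names the client advertised in `subtask_versions`.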
One trick that always helps when reverse engineering something popular: searching for unique strings in Google and other search engines turns up other people's notes.
Same thing this time: AccountDuplicationCheck_false pointed me at https://github.com/fa0311/TwitterFrontendFlow and https://github.com/tsukumijima/tweepy-authlib, which look like very detailed implementations of the login flow that I could use as a reference.
Part of the challenge is that Twitter sometimes shows stuff like "confirm your email" when it detects suspicious activity, but I can't reliably reproduce and test such behavior. Looking at other people's code helps find such cases before they randomly break things at some future point.
Ran into an interesting gotcha. Some API endpoints expect GET requests rather than POST, for example https://twitter.com/i/api/graphql/zC51NksbixfctE9X0ITB-Q/Viewer. But if you send a POST request by mistake, the server won't just return a 405 Method Not Allowed; it acts as if it didn't recognize any of the GET query parameters: "The following features cannot be null: responsive_web_graphql_exclude_directive_enabled, verified_phone_label_enabled, responsive_web_graphql_skip_user_profile_image_extensions_enabled, responsive_web_graphql_timeline_navigation_enabled, blue_business_profile_image_shape_enabled". Which makes me wonder if the server is interpreting it as application/x-www-form-urlencoded?..
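To avoid repeating the mistake, a sketch that pins the method explicitly when building the request. It assumes the feature flags travel as a JSON object in a `features` query parameter, which is my reading of the error message rather than anything documented, and the flag value used below is a guess:

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

VIEWER_URL = "https://twitter.com/i/api/graphql/zC51NksbixfctE9X0ITB-Q/Viewer"


def viewer_request(features, headers):
    """Build (but do not send) a GET request with features in the query string."""
    query = urlencode({"features": json.dumps(features)})
    # No body and an explicit method, so this can never silently become a POST.
    return Request(f"{VIEWER_URL}?{query}", headers=headers, method="GET")


req = viewer_request({"verified_phone_label_enabled": False}, {})
print(req.get_method(), req.full_url)
```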
Not sure if anything interesting can be done with this.
Another weirdness of Twitter's API: it claims to be GraphQL (like, in the URL: https://twitter.com/i/api/graphql/q4cKckK0lNxWkHfAXXXzJQ/Following), but I don't think it actually is?
Until now I haven't had to use GraphQL for anything, so I could be wrong, but it's nothing like what http://graphql.org/learn describes. Maybe they use GraphQL on the backend to generate responses for those APIs? But then, I thought the whole point was to let the client make its own queries.
TBH, Twitter's API is one of the odder ones I've seen. It seems to be built around the paradigm of the backend telling the frontend what to do. Like, fetching the timeline isn't just "give me a list of tweets after X", but the backend sending you instructions on what to add to and what to remove from the timeline.
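A toy model of that paradigm, to make the contrast concrete. The instruction shape here (`type` of `add` or `remove`, plus a list of `entries`) is made up for illustration and is not Twitter's actual format:

```python
def apply_instructions(timeline, instructions):
    """Apply server-sent add/remove instructions to a local timeline."""
    result = list(timeline)
    for instruction in instructions:
        if instruction["type"] == "add":
            for entry in instruction["entries"]:
                if entry not in result:
                    result.append(entry)
        elif instruction["type"] == "remove":
            result = [e for e in result if e not in instruction["entries"]]
        # Unknown instruction types are skipped in this sketch.
    return result


print(apply_instructions(
    ["tweet-1", "tweet-2"],
    [
        {"type": "add", "entries": ["tweet-3"]},
        {"type": "remove", "entries": ["tweet-1"]},
    ],
))
```

The client ends up as an interpreter for whatever the server sends, rather than asking for data and deciding itself how to merge it.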
Just to illustrate my point about how bizarre Twitter's API is, here is an example of the https://twitter.com/i/api/graphql/q4cKckK0lNxWkHfAXXXzJQ/Following response. Just look at how complex this thing is for the purpose of showing a list of users.
I understand stuff like "historical reasons" and for a system as big as Twitter they definitely must be at play, but I'd really be curious to learn those reasons...