
There's truth in the Embrace, Enhance, Extinguish theory. But #Meta has its own strategy. It's called Copy, Acquire, K*ll.

I wrote about that strategy as it relates to #threads and #mastodon. I also have some insight for what Zuck might be up to. It's not what you think.

fromjason.xyz/p/notebook/copy-

(157/200)

When actively scraping, the main starting function is

document.querySelectorAll()

This returns a NodeList, which one typically iterates over with a for-loop.

On each item, either querySelector or querySelectorAll is applied recursively until all the specific data instances are extracted.

This data is then saved in various formats depending on future processing, either as an object in an array or as a string, which is then saved to localStorage, sessionStorage, or IndexedDB, or downloaded via a temporary link.
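The steps above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the function names extractRecords and saveRecords, the selectors, and the storage key are all hypothetical, and only one level of nesting is handled.

```javascript
// Sketch only: function names, selectors, and the storage key are hypothetical.
function extractRecords(root, itemSelector, fieldSelectors) {
  const records = [];
  // querySelectorAll returns a NodeList, which can be iterated with for...of.
  for (const item of root.querySelectorAll(itemSelector)) {
    const record = {};
    for (const [field, selector] of Object.entries(fieldSelectors)) {
      // querySelector is applied again, scoped to the current item.
      const node = item.querySelector(selector);
      record[field] = node ? node.textContent.trim() : null;
    }
    records.push(record);
  }
  return records;
}

function saveRecords(storage, key, records) {
  // localStorage and sessionStorage only hold strings, so serialize to JSON.
  storage.setItem(key, JSON.stringify(records));
}
```

In the console this might be called as saveRecords(localStorage, "scrape", extractRecords(document, "article", { title: "h2", author: ".author" })), with the selectors adjusted to the page at hand.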


(156/200)

The question persists: why should one learn how to scrape? The obvious answer is to get data from the webpage. Further reasons are to learn how to evaluate a website and then build extensions to present the page to one’s liking.

Although web scraping might have a negative connotation, how different is it from skimming literature and picking out specific patterns? And with AI/LLM on the rise, one can now evaluate texts even quicker.


This is a majestic view of Mt. Fuji surmounted by lenticular clouds while reflected in a lake.

[📸 Taitan21] #Japan #MtFuji #photography

(155/200)

To actively scrape a webpage one either employs an extension or uses the console.

Here the difference is where the code lives and who maintains it. The benefit of using the console is that one is browser agnostic and can still keep a level of anonymity, whereas an extension could be used as a fingerprint marker.

E.g. if using the browser one should not diverge from the installed extensions, since one will be more easily identified compared to the herd. Using the console would be preferred in this case.

On the flip side, using an extension avoids the need to copy and paste the code into the console every time.


(154/200)

To passively scrape a webpage one uses automation tools, ideally headless browsers like or . Of course one can use any tool that is typically used for testing in the .

The biggest obstacle for passive scraping is dealing with either captchas or Cloudflare.

Captcha farms can be used for a small monetary fee, and Cloudflare can be overcome by IP hopping.

In general, passively scraping only works on websites that were poorly configured.


Android Auto support for our sandboxed Google Play compatibility layer has been merged into GrapheneOS and should be available in the next release. It's currently going through final review and internal testing leading up to being able to make a public Alpha channel release.

#GrapheneOS #privacy #security #AndroidAuto

(153/200)

There are two main ways to scrape a webpage, either actively or passively.

Active scraping is the process of using a trigger to actively scrape the already loaded webpage.

Passive scraping is the process of having the tool navigate to the webpage and scrape it.

The main difference is how one gets to the loaded webpage.

(152/200)

Not only is hardware a concern, but also internet speed. Lots of websites use some kind of media like images or videos, and many don’t convert these to formats that are friendly to slow internet connections.

For images WEBP suffices, and for videos a bit rate of 8 Mbit/s.


(151/200)

Lots of websites these days are being first built on the client. This can easily be checked when downloading the does not align with the from the inspector.

This has the benefit for the provider to save transfer cost, though on the flip side, the client will need to have a specific amount of to successfully render the site.

(150/200)

Designing themes with is fairly straightforward, the difficulty is creating a or color palette in the first place.

In this approach the “import full palette” method was chosen. This consists of importing the color palette and assigning each color a unique identifier. The type ThemeDefinition exists to help with naming conventions. The additional name to add is accent, which should fit well with the primary and secondary colors.

Later when the is being built one can directly choose from the palette.

My Ruck Club  
#devlog - New color theme with dedicated color palette and dark/light toggle button. - Markers on map finally have a dialog popup (which was one f...

(149/200)

that the female has on average proportionally shorter legs in respect to their body height compared to their male counterparts.

This could also explain why, on average, the female human is biologically predisposed to be able to touch their feet with their hands while keeping their legs straight.

(148/200)

Currently just imagining the idea of surpassing the weeklyOSM blog count $w$ with the DailyBloggingChallenge count $d$.

To calculate the number of weeks $x$, one sets $w(x) = d(x)$ with $w(x) = 700 + x$ and $d(x) = 148 + 7x$. This gives $x = 92$ weeks, or 645 days until it is surpassed.
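Spelled out as a short worked derivation, using only the two definitions given above:

$$
\begin{aligned}
w(x) &= d(x) \\
700 + x &= 148 + 7x \\
552 &= 6x \\
x &= 92
\end{aligned}
$$

At $x = 92$ both counts equal $792$. Since $92 \cdot 7 = 644$ days is the point where the counts are level, day 645 is the first day on which $d$ is strictly larger than $w$.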

So just before (relatively speaking) edition, the DailyBloggingChallenge would overtake it.

Well, before fantasizing about hypothetical goals, I should stick with the current goal of 200.


(147/200)

As an active participant of the @weeklyOSM project, which is celebrating its 700th weekly news update, one gets to admire a long-lasting community project.

This would put the first edition almost 14 years ago, which is only a couple of years after the project started.

There are a lot of people working behind the scenes: gathering the news stories, writing up a small summary, translating these into the various languages, proofreading, and finally publishing at the end of the week.

(146/200)

One downside of opening its borders to be part of is that when flying into with there is a high chance that the plane will continue its flight outside of Schengen. This means that the plane will park at the international terminal, making it more convenient for the upcoming passengers, though the current passengers get conveyed by bus to the terminal.

On the flip side, flying out to the Schengen area is similar to before, with access to most amenities.

They put border patrol right in front of the gate, instead of right after security as most airports do.

(145/200)

The TagTable has a photo_ids_list which contains each photo or video with the specific tag. Further, each id is prefixed with either thumb for a photo or video- for a video. The id itself is saved as a hexadecimal number.

Once one has the ids as decimal numbers, one can use them to search through either the PhotoTable or VideoTable.
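The prefix stripping and hexadecimal-to-decimal conversion described above could look roughly like this. It is a sketch under assumptions: the function names parseMediaId and parseIdList are hypothetical, and the comma separator between entries is assumed, not confirmed by the post.

```javascript
// Prefixes as described in the post; the comma separator is an assumption.
const PREFIXES = [
  ["thumb", "photo"],
  ["video-", "video"],
];

function parseMediaId(entry) {
  for (const [prefix, kind] of PREFIXES) {
    if (entry.startsWith(prefix)) {
      const hex = entry.slice(prefix.length);
      // Reject anything that is not purely hexadecimal digits.
      if (!/^[0-9a-fA-F]+$/.test(hex)) {
        throw new Error(`not a hexadecimal id: ${entry}`);
      }
      // Convert the hexadecimal id to a decimal number.
      return { kind, id: parseInt(hex, 16) };
    }
  }
  throw new Error(`unknown prefix: ${entry}`);
}

function parseIdList(list, separator = ",") {
  return list
    .split(separator)
    .map((s) => s.trim())
    .filter((s) => s.length > 0) // ignore empty entries, e.g. a trailing separator
    .map((s) => parseMediaId(s));
}
```

For example, parseMediaId("thumb000000000000001a") gives { kind: "photo", id: 26 }, and the kind then decides whether to look up the id in the PhotoTable or the VideoTable.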


(144/200)

One downside of using as a photo and video manager is that it isn’t as straightforward to extract tags from videos as it is with photos. With photos one can grab the data directly with an tool.

After going through the source code and battling with and , I got a script that extracts tags from both media formats.

The nice thing is that it is more performant than the previous script I was using only for images.

Qoto Mastodon

QOTO: Question Others to Teach Ourselves
An inclusive, Academic Freedom, instance
All cultures welcome.
Hate speech and harassment strictly forbidden.