Here are the slides for my #PyDataLondon keynote on LLMs from prototype to production ✨

Including:
◾ visions for NLP in the age of LLMs
◾ a case for LLM pragmatism
◾ solutions for structured data
◾ spaCy LLM + prodi.gy

speakerdeck.com/inesmontani/la

🎊🎁Big release of dirty-cat
dirty-cat.github.io/stable/

Broader focus: simplifying the preparation of non-curated dataframes for machine learning.
🔸Encoding of messy dataframes: a strong baseline for easy machine learning
🔸fuzzy_join: joining dataframes (pd.merge) despite typos
🔸Deduplication: matching categories with typos (see the sketch below)
🔸Feature augmentation: joining on an external data source to enrich tabular data
🔸Embeddings of cities, companies, locations...
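
As a taste of the deduplication entry, a minimal sketch using dirty-cat's `deduplicate` helper; the exact defaults and return type are assumptions, so check the release docs linked above:

```python
from dirty_cat import deduplicate

# Hand-typed categories with typos and spelling variants
dirty = ["online course", "online courses", "onlin course",
         "book", "books", "boook"]

# Clusters similar strings (character n-gram similarity) and maps each
# entry to a representative spelling of its cluster (assumed behavior)
clean = deduplicate(dirty)
```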

Tabular data can benefit from merging external sources of information.

The FeatureAugmenter is a scikit-learn transformer that augments a given dataframe with joins on reference tables.
dirty-cat.github.io/stable/gen
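
A minimal sketch of that pattern, assuming the `tables` / `main_key` parameters from the dirty-cat docs; the GDP table and its values are made up for illustration:

```python
import pandas as pd
from dirty_cat import FeatureAugmenter

# Main table to enrich
main = pd.DataFrame({"Country": ["France", "Germany", "Italy"]})

# Reference table (hypothetical numbers)
gdp = pd.DataFrame({"Country": ["France", "Germany", "Italy"],
                    "GDP_trillion_usd": [2.78, 4.07, 2.01]})

# Fuzzy-joins each (table, key) pair onto the main dataframe,
# usable as a step inside a scikit-learn Pipeline
fa = FeatureAugmenter(tables=[(gdp, "Country")], main_key="Country")
augmented = fa.fit_transform(main)  # main columns + GDP_trillion_usd
```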

fuzzy_join makes the join robust to vocabulary mismatches. Hyperparameter optimization can tune the matching for prediction.
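
A sketch of that, assuming `fuzzy_join`'s `on` and `return_score` parameters as shown in the dirty-cat docs; the exact output columns are an assumption:

```python
import pandas as pd
from dirty_cat import fuzzy_join

left = pd.DataFrame({"Country": ["France", "Italia", "Spanish"]})
right = pd.DataFrame({"Country": ["France", "Italy", "Spain"],
                      "Capital": ["Paris", "Rome", "Madrid"]})

# Unlike an exact pd.merge, the join matches nearest-neighbor strings,
# so "Italia" and "Spanish" still find their rows despite the mismatch
joined = fuzzy_join(left, right, on="Country", return_score=True)
# The returned match score can be thresholded, and that threshold tuned
# by hyperparameter optimization against downstream prediction quality
```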

For such external information, dirty-cat can download embeddings of Wikipedia data on millions of entities: companies, cities, geographic locations...
dirty-cat.github.io/stable/aut
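
A hedged sketch of pulling those entity embeddings; the helper name and the `types` filter follow my reading of the dirty-cat docs and may differ, so treat the whole call as an assumption:

```python
# Helper name per the linked docs at the time; verify in the API reference
from dirty_cat.datasets import get_ken_embeddings

# Download pre-computed Wikipedia (KEN) entity embeddings, filtered to
# companies (the `types` argument is an assumption)
companies = get_ken_embeddings(types="company")

# One row per entity plus its embedding dimensions, ready to fuzzy-join
# onto a tabular dataset as extra features
print(companies.head())
```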

Productive weekend! Just added 4 new Q&A's!

- Multi-GPU Training Paradigms
- The Distributional Hypothesis
- "Self"-Attention
- Training & Test Set Discordance

And "Machine Learning Q and AI" just crossed the 50% milestone! 🎉

PS: I included the Multi-GPU Training Paradigms section in the free preview at
leanpub.com/machine-learning-q

CALL FOR PAPERS: Research and Innovation Track

We welcome papers on novel scientific research and/or innovations relevant to #SemanticWeb, #KnowledgeGraphs, #AI, #ML, #NLP and more

Deadlines:
🗓️Abstracts: May 09
🗓️Papers: May 16

For more info: 🌐2023-eu.semantics.cc/page/cfp_

On our own behalf: heise online is moving to its own Mastodon instance

The chaos at Twitter continues, and Mastodon keeps benefiting from it. Heise Medien now runs its own instance on the Fediverse network.

heise.de/news/Twitter-Alternat

#Fediverse #Heise #Mastodon #SocialMedia #Twitter #TwitterÜbernahme #heiseonline

Do you love #selfhosting? What about providing service to the public via #Codeberg?

We are looking for maintainers to take on adding code search to our #Forgejo instance, reducing the load on the existing infrastructure team and bringing this project forward.

Please see codeberg.org/Codeberg/Communit if you are interested.

We look forward to your contributions. Thank you very much!

Code Search: Looking for maintainers

An existing issue (#379) tracks the feature to enable code search on Codeberg. To move this forward, I'd like to put together a team of people available to experiment with it and maintain the setup in the future. The plan could look as follows:

- discuss the resource allocation (memory, CPU, and disk storage) with the Codeberg infrastructure team
- create an LXC container on our infrastructure matching the above results
- investigate the use of [ZincSearch](https://github.com/zinclabs/zinc) vs. OpenSearch
- configure Forgejo / Gitea to connect to the setup using a test instance (see the sketch below)
- adapt and finish [this pull request](https://codeberg.org/Codeberg/forgejo/pulls/47) to enable code search per repository (opt-in)
- enable code search on the production instance and continue maintaining the setup inside the LXC container

If this sounds interesting to you, please reach out here. We're available to collaborate closely with you, but it would be great if a dedicated team could push the effort and iterate independently. Thank you very much!

Useful links:
- Forgejo config: https://docs.gitea.io/en-us/config-cheat-sheet/#indexer-indexer
- #379 with experiences
- connect Forgejo to ZincSearch: https://github.com/zinclabs/zinc/issues/538#issuecomment-1251748395
- consider Sourcegraph as an alternative to Forgejo's built-in search: https://about.sourcegraph.com/
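
For the "configure Forgejo / Gitea" step, a hedged sketch of the `[indexer]` section documented in the config cheat sheet linked above; the ZincSearch host, port, and `/es` path are assumptions based on its Elasticsearch-compatible API, not tested values:

```ini
; app.ini on the test instance — placeholder values
[indexer]
REPO_INDEXER_ENABLED = true
; ZincSearch speaks an Elasticsearch-compatible protocol, so the
; elasticsearch indexer type is the natural candidate to point at it
REPO_INDEXER_TYPE = elasticsearch
; assumed ZincSearch endpoint (default port 4080, ES-compatible /es path)
REPO_INDEXER_CONN_STR = http://localhost:4080/es
```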


The most interesting thing about #ChatGPT that no one is talking about is how the future will be systems talking to each other over imprecise protocols while still being able to understand each other

And the year has barely started!

RT @MishaLaskin@twitter.com

In-context RL at scale. After online pre-training, the agent solves new tasks entirely in-context like an LLM and works in a complex domain. One of the most interesting RL results of the year. twitter.com/FeryalMP/status/16

🐦🔗: twitter.com/MishaLaskin/status
