These are public posts tagged with #dl.
Training large language models requires extensive processing,…
hgpu.org
GPU-centric Communication Schemes for HPC and ML Applications
Compute nodes on modern heterogeneous supercomputing…
hgpu.org
A puzzle, coffee, and an armful of books, or how I searched for the origins of the term "Deep Learning". Part 2
Hi! Some time ago I started searching for the origins of the term "Deep Learning". Back then I studied only foreign sources and promised to return later with a review of the Soviet and Russian literature. Well, there is no putting it off any longer. Let's see whom domestic authors cite when it comes to the history of deep learning. No long introduction: put your fingers on Ctrl/Cmd+F and let's start digging!
https://habr.com/ru/companies/selectel/articles/899050/
#selectel #ии #искусственный_интеллект #машинное_обучение #ml #dl #deep_learning #глубокое_обучение #познавательное
Hi! Some time ago I started searching for the origins…
Habr
How RamaLama helps make AI model testing safer https://buff.ly/5H8phOO #AI #ML #DL #NN #oss #opensource
Large deep learning models have achieved state-of-the-art…
hgpu.org
Red Hat named to Fast Company’s annual list of the World’s Most Innovative Companies of 2025 https://buff.ly/yJk2dvw #AI #ML #DL #NN #oss #opensource
Moore’s Law for AI agents: the length of tasks that AIs can do is doubling about every 7 months.
These results appear robust: the authors were able to retrodict the trend back to GPT-2, and further experiments on SWE-bench Verified show a similar pattern.
Read more: https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks
#AIBoom #AI #AIAgents #AIAgent #ArtificialIntelligence #GPT2 #MooreLaw #Tasks #DL #ML #Pustam #Raut #AIRevolution
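As a rough illustration of the doubling claim above, here is a minimal sketch of exponential task-horizon growth under a 7-month doubling period. The baseline horizon (60 minutes) and time span are hypothetical, chosen only to show the arithmetic:

```python
def task_horizon(baseline_minutes: float, months: float,
                 doubling_months: float = 7.0) -> float:
    """Task length an agent can handle after `months`, assuming the
    horizon doubles every `doubling_months` months."""
    return baseline_minutes * 2 ** (months / doubling_months)

# Starting from a hypothetical 60-minute horizon, 21 months is
# three doublings: 60 -> 120 -> 240 -> 480 minutes.
print(task_horizon(60, 21))  # → 480.0
```

Under this trend, horizons grow by roughly 2^(12/7) ≈ 3.3x per year.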
The rise of AI research in a graph — see how its ArXiv submissions compare to other fields over the past decade. #AI #ArXiV #ArtificialIntelligence #DL #ML #CV #NLP #XAI #AIResearch #CS #ComputerScience #DataScience #Research #Revolution #AIBoom #AIRevolution
Self-Improving Reasoners.
Both expert human problem solvers and successful language models employ four key cognitive behaviors
1. verification (systematic error-checking),
2. backtracking (abandoning failing approaches),
3. subgoal setting (decomposing problems into manageable steps), and
4. backward chaining (reasoning from desired outcomes to initial inputs).
Some language models naturally exhibit these reasoning behaviors and show substantial gains, while others don't and quickly plateau.
The presence of reasoning behaviors, not the correctness of answers, is the critical factor. Models trained on incorrect solutions that contain proper reasoning patterns achieve performance comparable to models trained on correct solutions.
It seems that the presence of cognitive behaviors enables self-improvement through RL.
Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs
https://arxiv.org/abs/2503.01307
Test-time inference has emerged as a powerful paradigm…
arXiv.org
CRIUgpu: Transparent Checkpointing of GPU-Accelerated Workloads
Deep learning training at scale is resource-intensive…
hgpu.org
TritonBench: Benchmarking Large Language Model Capabilities for Generating Triton Operators
#CUDA #CodeGeneration #LLM #DeepLearning #DL #Python #Package
Triton, a high-level Python-like language designed…
hgpu.org
"We encourage the open source community, regulatory authorities and industry to continue to strive toward greater transparency and alignment with open source development principles when training and fine-tuning AI models" https://buff.ly/3Eyn85w #AI #ML #DL #NN #oss #opensource
Why is transparent, open data important to LLMs (Part 2)? https://buff.ly/3QhjQ9t #AI #ML #DL #NN #oss #opensource
Why is transparent, open data important to LLMs? https://buff.ly/42UZwSV #AI #ML #DL #NN #oss #opensource
Thesis: Towards autonomous resource management: Deep learning prediction of CPU-GPU load balancing
I agree with RedMonk/O'Grady; that's why I recommend "ramalama run ollama://deepseek-r1:7b" instead of using the web or app versions of DeepSeek https://buff.ly/4gDvLsR #AI #ML #DL #NN #oss #opensource
“...If LLMs are just software, then containers are really convenient for LLMs....” https://buff.ly/42zNVZs #AI #ML #DL #NN #oss #opensource
It's interesting to see the business news evangelize what Red Hat has been saying! AI in 2025: Rush for use cases https://buff.ly/4hwrxnS
#AI #ML #DL #NN #oss #opensource
Keras Sig: Efficient Path Signature Computation on GPU in Keras 3
In this paper we introduce Keras Sig, a high-performance…
hgpu.org