#feedforward
Explore tagged Tumblr posts
Note
How much/quickly do you think AI is going to expand and improve materials science? It feels like a scientific field which is already benefiting tremendously.
My initial instinct was yes, MSE is already benefiting tremendously as you said. At least in terms of the fundamental science and research, AI is huge in materials science. So how quickly? I'd say it's already doing so, and it's only going to move quicker from here. But I'm coming at this from the perspective of a metallurgist who works in/around academia at the moment, with the bias that probably more than half of my research group does computational work. So let's take a step back.
So, first, AI. It's... not a great term. So here's what I, specifically, am referring to when I talk about AI in materials science:
Most of the people I know in AI would refer to what they do as machine learning or deep learning, so machine learning tends to be what I use as a preferred term. And as you can see from the above image, it can do a lot. The thing is, on a fundamental level, materials science is all about how our 118 elements (~90, if you want to ignore everything past uranium and a few others that aren't practical to use) interact. That's a lot of combinations. (Yes, yes, we're not getting into the distinction between materials science, chemistry, and physics right now.) If you're trying to make a new alloy that has X properties and Y price, computers are so much better at running through all the options than a human would be. Or if you have 100 images you want to analyze to get grain size—we're getting to the point where computers can do it faster. (The question is, can they do it better? And this question can get complicated fast. What is better? What is the size of the grain? We're not going to get into 'ground truth' debates here though.) Plenty of other examples exist.
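To put a rough number on "a lot of combinations," here's a quick back-of-the-envelope count (my own illustration, not from any source above): just choosing which five of the ~90 practical elements go into a candidate alloy already gives tens of millions of options, before composition or processing even enters the picture.

```python
# Quick scale check (illustrative only): how many 5-element systems can you
# build from roughly 90 practically usable elements?
import math

n_elements = 90   # ~the practical portion of the periodic table
alloy_size = 5    # e.g. a five-component high entropy alloy

print(f"{math.comb(n_elements, alloy_size):,}")  # ~44 million element combinations
# ...and that's before choosing compositions, processing routes, or heat treatments,
# which is exactly the kind of search space ML screening is good at narrowing down.
```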
Even beyond the science of it all, machine learning can help collect knowledge in one place. That's what the text/literature bubble above means: there are so many old articles that don't have data attached to them, and I know people personally who are working on the problem of training systems to pull data from pdfs (mainly tables and graphs) so that that information can be collated.
I won't ramble too long about the usage of machine learning in MSE because that could get long quickly, and the two sources I'm linking here cover that far better than I could. But I'll give you this plot from research in 2019 (so already 6 years out of date!) about the growth of machine learning in materials science:
I will leave everyone with the caveat though, that when I say machine learning is huge in MSE, I am, as I said in the beginning, referring to fundamental research in the field. From my perspective, in terms of commercial applications we've still got a ways to go before we trust computers to churn out parts for us. Machine learning can tell researchers the five best element combinations to make a new high entropy alloy—but no company is going to commit to making that product until the predictions of the computer (properties, best processing routes, etc.) have been physically demonstrated with actual parts and tested in traditional ways.
Certain computational materials science techniques, like finite element analysis (which is not AI, though might incorporate it in the future) are trusted by industry, but machine learning techniques are not there yet, and still have a ways to go, as far as I'm aware.
So as for how much? Fundamental research for now only. New materials and high-throughput materials testing/characterization. But I do think, at some point, maybe ten years, maybe twenty years down the line, we'll start to see parts made whose processing was entirely informed by machine learning, possibly with feedback and feedforward control so that the finished parts don't need to be tested to know how they'll perform (see: Digital twins (Wikipedia) (Phys.org) (2022 article)). At that point, it's not a matter of whether the technology will be ready for it, it'll be a matter of how much we want to trust the technology. I don't think we'll do away with physical testing anytime soon.
But hey, that's just one perspective. If anyone's got any thoughts about AI in materials science, please, share them!
Source of image 1, 2022 article.
Source of image 2, 2019 article.
#Materials Science#Science#Artificial Intelligence#Replies#Computational materials science#Machine learning
23 notes
Text
Interesting Papers for Week 10, 2025
Simplified internal models in human control of complex objects. Bazzi, S., Stansfield, S., Hogan, N., & Sternad, D. (2024). PLOS Computational Biology, 20(11), e1012599.
Co-contraction embodies uncertainty: An optimal feedforward strategy for robust motor control. Berret, B., Verdel, D., Burdet, E., & Jean, F. (2024). PLOS Computational Biology, 20(11), e1012598.
Distributed representations of behaviour-derived object dimensions in the human visual system. Contier, O., Baker, C. I., & Hebart, M. N. (2024). Nature Human Behaviour, 8(11), 2179–2193.
Thalamic spindles and Up states coordinate cortical and hippocampal co-ripples in humans. Dickey, C. W., Verzhbinsky, I. A., Kajfez, S., Rosen, B. Q., Gonzalez, C. E., Chauvel, P. Y., Cash, S. S., Pati, S., & Halgren, E. (2024). PLOS Biology, 22(11), e3002855.
Preconfigured cortico-thalamic neural dynamics constrain movement-associated thalamic activity. González-Pereyra, P., Sánchez-Lobato, O., Martínez-Montalvo, M. G., Ortega-Romero, D. I., Pérez-Díaz, C. I., Merchant, H., Tellez, L. A., & Rueda-Orozco, P. E. (2024). Nature Communications, 15, 10185.
A tradeoff between efficiency and robustness in the hippocampal-neocortical memory network during human and rodent sleep. Hahn, M. A., Lendner, J. D., Anwander, M., Slama, K. S. J., Knight, R. T., Lin, J. J., & Helfrich, R. F. (2024). Progress in Neurobiology, 242, 102672.
NREM sleep improves behavioral performance by desynchronizing cortical circuits. Kharas, N., Chelaru, M. I., Eagleman, S., Parajuli, A., & Dragoi, V. (2024). Science, 386(6724), 892–897.
Human hippocampus and dorsomedial prefrontal cortex infer and update latent causes during social interaction. Mahmoodi, A., Luo, S., Harbison, C., Piray, P., & Rushworth, M. F. S. (2024). Neuron, 112(22), 3796-3809.e9.
Can compression take place in working memory without a central contribution of long-term memory? Mathy, F., Friedman, O., & Gauvrit, N. (2024). Memory & Cognition, 52(8), 1726–1736.
Offline hippocampal reactivation during dentate spikes supports flexible memory. McHugh, S. B., Lopes-dos-Santos, V., Castelli, M., Gava, G. P., Thompson, S. E., Tam, S. K. E., Hartwich, K., Perry, B., Toth, R., Denison, T., Sharott, A., & Dupret, D. (2024). Neuron, 112(22), 3768-3781.e8.
Reward Bases: A simple mechanism for adaptive acquisition of multiple reward types. Millidge, B., Song, Y., Lak, A., Walton, M. E., & Bogacz, R. (2024). PLOS Computational Biology, 20(11), e1012580.
Hidden state inference requires abstract contextual representations in the ventral hippocampus. Mishchanchuk, K., Gregoriou, G., Qü, A., Kastler, A., Huys, Q. J. M., Wilbrecht, L., & MacAskill, A. F. (2024). Science, 386(6724), 926–932.
Dopamine builds and reveals reward-associated latent behavioral attractors. Naudé, J., Sarazin, M. X. B., Mondoloni, S., Hannesse, B., Vicq, E., Amegandjin, F., Mourot, A., Faure, P., & Delord, B. (2024). Nature Communications, 15, 9825.
Compensation to visual impairments and behavioral plasticity in navigating ants. Schwarz, S., Clement, L., Haalck, L., Risse, B., & Wystrach, A. (2024). Proceedings of the National Academy of Sciences, 121(48), e2410908121.
Replay shapes abstract cognitive maps for efficient social navigation. Son, J.-Y., Vives, M.-L., Bhandari, A., & FeldmanHall, O. (2024). Nature Human Behaviour, 8(11), 2156–2167.
Rapid modulation of striatal cholinergic interneurons and dopamine release by satellite astrocytes. Stedehouder, J., Roberts, B. M., Raina, S., Bossi, S., Liu, A. K. L., Doig, N. M., McGerty, K., Magill, P. J., Parkkinen, L., & Cragg, S. J. (2024). Nature Communications, 15, 10017.
A hierarchical active inference model of spatial alternation tasks and the hippocampal-prefrontal circuit. Van de Maele, T., Dhoedt, B., Verbelen, T., & Pezzulo, G. (2024). Nature Communications, 15, 9892.
Cognitive reserve against Alzheimer’s pathology is linked to brain activity during memory formation. Vockert, N., Machts, J., Kleineidam, L., Nemali, A., Incesoy, E. I., Bernal, J., Schütze, H., Yakupov, R., Peters, O., Gref, D., Schneider, L. S., Preis, L., Priller, J., Spruth, E. J., Altenstein, S., Schneider, A., Fliessbach, K., Wiltfang, J., Rostamzadeh, A., … Ziegler, G. (2024). Nature Communications, 15, 9815.
The human posterior parietal cortices orthogonalize the representation of different streams of information concurrently coded in visual working memory. Xu, Y. (2024). PLOS Biology, 22(11), e3002915.
Challenging the Bayesian confidence hypothesis in perceptual decision-making. Xue, K., Shekhar, M., & Rahnev, D. (2024). Proceedings of the National Academy of Sciences, 121(48), e2410487121.
#neuroscience#science#research#brain science#scientific publications#cognitive science#neurobiology#cognition#psychophysics#neurons#neural computation#neural networks#computational neuroscience
15 notes
Text
Cancer Invasion Signals
Breast cancer organoids (3D lab-grown tissue models) demonstrate that a molecule called YAP in the cancer cell promotes collagen fibre alignment, remodelling the extracellular matrix, which in turn further activates YAP – a feedforward loop that enhances collective cancer cell invasion
Read the published research article here
Video from work by Antoine A. Khalil and colleagues
Center for Molecular Medicine (CMM), University Medical Center Utrecht, Utrecht, The Netherlands
Image originally published with a Creative Commons Attribution 4.0 International (CC BY 4.0)
Published in Nature Communications, June 2024
You can also follow BPoD on Instagram, Twitter and Facebook
16 notes
Text
The Building Blocks of AI: Neural Networks Explained by Julio Herrera Velutini
What is a Neural Network?
A neural network is a computational model inspired by the human brain’s structure and function. It is a key component of artificial intelligence (AI) and machine learning, designed to recognize patterns and make decisions based on data. Neural networks are used in a wide range of applications, including image and speech recognition, natural language processing, and even autonomous systems like self-driving cars.
Structure of a Neural Network
A neural network consists of layers of interconnected nodes, known as neurons. These layers include:
Input Layer: Receives raw data and passes it into the network.
Hidden Layers: Perform complex calculations and transformations on the data.
Output Layer: Produces the final result or prediction.
Each neuron in a layer is connected to neurons in the next layer through weighted connections. These weights determine the importance of input signals, and they are adjusted during training to improve the model’s accuracy.
How Do Neural Networks Work?
Neural networks learn by processing data through forward propagation and adjusting their weights using backpropagation. This learning process involves:
Forward Propagation: Data moves from the input layer through the hidden layers to the output layer, generating predictions.
Loss Calculation: The difference between predicted and actual values is measured using a loss function.
Backpropagation: The network adjusts weights based on the loss to minimize errors, improving performance over time.
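For readers who like to see this in code, here is a minimal sketch of that forward propagation / loss / backpropagation cycle in PyTorch; the layer sizes and random toy data are placeholders, purely for illustration.

```python
import torch
import torch.nn as nn

model = nn.Sequential(       # input layer -> hidden layer -> output layer
    nn.Linear(4, 16),
    nn.ReLU(),
    nn.Linear(16, 1),
)
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

X = torch.randn(32, 4)       # 32 toy samples with 4 features each
y = torch.randn(32, 1)       # toy targets

for epoch in range(100):
    pred = model(X)          # forward propagation
    loss = loss_fn(pred, y)  # loss calculation
    optimizer.zero_grad()
    loss.backward()          # backpropagation: gradients of the loss w.r.t. every weight
    optimizer.step()         # adjust weights to reduce the error
```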
Types of Neural Networks
Several types of neural networks exist, each suited for specific tasks:
Feedforward Neural Networks (FNN): The simplest type, where data moves in one direction.
Convolutional Neural Networks (CNN): Used for image processing and pattern recognition.
Recurrent Neural Networks (RNN): Designed for sequential data like time-series analysis and language processing.
Generative Adversarial Networks (GANs): Used for generating synthetic data, such as deepfake images.
Conclusion
Neural networks have revolutionized AI by enabling machines to learn from data and improve performance over time. Their applications continue to expand across industries, making them a fundamental tool in modern technology and innovation.
3 notes
Text
Mastering Neural Networks: A Deep Dive into Combining Technologies
How Can Two Trained Neural Networks Be Combined?
Introduction
In the ever-evolving world of artificial intelligence (AI), neural networks have emerged as a cornerstone technology, driving advancements across various fields. But have you ever wondered how combining two trained neural networks can enhance their performance and capabilities? Let’s dive deep into the fascinating world of neural networks and explore how combining them can open new horizons in AI.
Basics of Neural Networks
What is a Neural Network?
Neural networks, inspired by the human brain, consist of interconnected nodes or "neurons" that work together to process and analyze data. These networks can identify patterns, recognize images, understand speech, and even generate human-like text. Think of them as a complex web of connections where each neuron contributes to the overall decision-making process.
How Neural Networks Work
Neural networks function by receiving inputs, processing them through hidden layers, and producing outputs. They learn from data by adjusting the weights of connections between neurons, thus improving their ability to predict or classify new data. Imagine a neural network as a black box that continuously refines its understanding based on the information it processes.
Types of Neural Networks
From simple feedforward networks to complex convolutional and recurrent networks, neural networks come in various forms, each designed for specific tasks. Feedforward networks are great for straightforward tasks, while convolutional neural networks (CNNs) excel in image recognition, and recurrent neural networks (RNNs) are ideal for sequential data like text or speech.
Why Combine Neural Networks?
Advantages of Combining Neural Networks
Combining neural networks can significantly enhance their performance, accuracy, and generalization capabilities. By leveraging the strengths of different networks, we can create a more robust and versatile model. Think of it as assembling a team where each member brings unique skills to tackle complex problems.
Applications in Real-World Scenarios
In real-world applications, combining neural networks can lead to breakthroughs in fields like healthcare, finance, and autonomous systems. For example, in medical diagnostics, combining networks can improve the accuracy of disease detection, while in finance, it can enhance the prediction of stock market trends.
Methods of Combining Neural Networks
Ensemble Learning
Ensemble learning involves training multiple neural networks and combining their predictions to improve accuracy. This approach reduces the risk of overfitting and enhances the model's generalization capabilities.
Bagging
Bagging, or Bootstrap Aggregating, trains multiple versions of a model on different subsets of the data and combines their predictions. This method is simple yet effective in reducing variance and improving model stability.
Boosting
Boosting focuses on training sequential models, where each model attempts to correct the errors of its predecessor. This iterative process leads to a powerful combined model that performs well even on difficult tasks.
Stacking
Stacking involves training multiple models and using a "meta-learner" to combine their outputs. This technique leverages the strengths of different models, resulting in superior overall performance.
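As a concrete (and deliberately simplified) sketch, here is what stacking can look like with scikit-learn: a small neural network and a random forest as base models, with a logistic-regression meta-learner combining their outputs. The dataset and hyperparameters are placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

stack = StackingClassifier(
    estimators=[
        ("mlp", MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ],
    final_estimator=LogisticRegression(),  # the meta-learner that combines base predictions
)
stack.fit(X_train, y_train)
print("stacked accuracy:", stack.score(X_test, y_test))
```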
Transfer Learning
Transfer learning is a method where a pre-trained neural network is fine-tuned on a new task. This approach is particularly useful when data is scarce, allowing us to leverage the knowledge acquired from previous tasks.
Concept of Transfer Learning
In transfer learning, a model trained on a large dataset is adapted to a smaller, related task. For instance, a model trained on millions of images can be fine-tuned to recognize specific objects in a new dataset.
How to Implement Transfer Learning
To implement transfer learning, we start with a pretrained model, freeze some layers to retain their knowledge, and fine-tune the remaining layers on the new task. This method saves time and computational resources while achieving impressive results.
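A minimal PyTorch sketch of that recipe, assuming a recent torchvision with pretrained ImageNet weights available; the class count and optimizer settings are placeholders:

```python
import torch
import torch.nn as nn
import torchvision.models as models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)  # pretrained on ImageNet

for param in model.parameters():   # freeze the pretrained feature extractor
    param.requires_grad = False

num_classes = 5                    # placeholder: number of classes in the new task
model.fc = nn.Linear(model.fc.in_features, num_classes)  # fresh head, trainable by default

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
# ...then train on the new dataset as usual; only the head's weights get updated.
```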
Advantages of Transfer Learning
Transfer learning enables quicker training times and improved performance, especially when dealing with limited data. It’s like standing on the shoulders of giants, leveraging the vast knowledge accumulated from previous tasks.
Neural Network Fusion
Neural network fusion involves merging multiple networks into a single, unified model. This method combines the strengths of different architectures to create a more powerful and versatile network.
Definition of Neural Network Fusion
Neural network fusion integrates different networks at various stages, such as combining their outputs or merging their internal layers. This approach can enhance the model's ability to handle diverse tasks and data types.
Types of Neural Network Fusion
There are several types of neural network fusion, including early fusion, where networks are combined at the input level, and late fusion, where their outputs are merged. Each type has its own advantages depending on the task at hand.
Implementing Fusion Techniques
To implement neural network fusion, we can combine the outputs of different networks using techniques like averaging, weighted voting, or more sophisticated methods like learning a fusion model. The choice of technique depends on the specific requirements of the task.
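As an illustration of the simplest of those options, here is a hedged sketch of late fusion by weighted averaging of class probabilities; `model_a` and `model_b` stand in for two already-trained PyTorch classifiers over the same label set.

```python
import torch
import torch.nn.functional as F

def fused_predict(model_a, model_b, x, weight_a=0.5):
    """Late fusion: weighted average of two models' class probabilities."""
    with torch.no_grad():
        probs_a = F.softmax(model_a(x), dim=-1)
        probs_b = F.softmax(model_b(x), dim=-1)
    fused = weight_a * probs_a + (1.0 - weight_a) * probs_b
    return fused.argmax(dim=-1)   # predicted class per sample
```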
Cascade Network
Cascade networks involve feeding the output of one neural network as input to another. This approach creates a layered structure where each network focuses on different aspects of the task.
What is a Cascade Network?
A cascade network is a hierarchical structure where multiple networks are connected in series. Each network refines the outputs of the previous one, leading to progressively better performance.
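A toy PyTorch sketch of that idea (dimensions are arbitrary placeholders): the second network receives both the original input and the first network's coarse prediction, and learns to refine it.

```python
import torch
import torch.nn as nn

class Cascade(nn.Module):
    def __init__(self, in_dim=16, hidden=32, out_dim=4):
        super().__init__()
        # stage 1 makes a coarse prediction from the raw input
        self.stage1 = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, out_dim))
        # stage 2 refines that prediction, conditioned on the original input as well
        self.stage2 = nn.Sequential(nn.Linear(in_dim + out_dim, hidden), nn.ReLU(), nn.Linear(hidden, out_dim))

    def forward(self, x):
        coarse = self.stage1(x)
        return self.stage2(torch.cat([x, coarse], dim=-1))
```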
Advantages and Applications of Cascade Networks
Cascade networks are particularly useful in complex tasks where different stages of processing are required. For example, in image processing, a cascade network can progressively enhance image quality, leading to more accurate recognition.
Practical Examples
Image Recognition
In image recognition, combining CNNs with ensemble methods can improve accuracy and robustness. For instance, a network trained on general image data can be combined with a network fine-tuned for specific object recognition, leading to superior performance.
Natural Language Processing
In natural language processing (NLP), combining RNNs with transfer learning can enhance the understanding of text. A pre-trained language model can be fine-tuned for specific tasks like sentiment analysis or text generation, resulting in more accurate and nuanced outputs.
Predictive Analytics
In predictive analytics, combining different types of networks can improve the accuracy of predictions. For example, a network trained on historical data can be combined with a network that analyzes real-time data, leading to more accurate forecasts.
Challenges and Solutions
Technical Challenges
Combining neural networks can be technically challenging, requiring careful tuning and integration. Ensuring compatibility between different networks and avoiding overfitting are critical considerations.
Data Challenges
Data-related challenges include ensuring the availability of diverse and high-quality data for training. Managing data complexity and avoiding biases are essential for achieving accurate and reliable results.
Possible Solutions
To overcome these challenges, it’s crucial to adopt a systematic approach to model integration, including careful preprocessing of data and rigorous validation of models. Utilizing advanced tools and frameworks can also facilitate the process.
Tools and Frameworks
Popular Tools for Combining Neural Networks
Tools like TensorFlow, PyTorch, and Keras provide extensive support for combining neural networks. These platforms offer a wide range of functionalities and ease of use, making them ideal for both beginners and experts.
Frameworks to Use
Frameworks like Scikit-learn, Apache MXNet, and Microsoft Cognitive Toolkit offer specialized support for ensemble learning, transfer learning, and neural network fusion. These frameworks provide robust tools for developing and deploying combined neural network models.
Future of Combining Neural Networks
Emerging Trends
Emerging trends in combining neural networks include the use of advanced ensemble techniques, the integration of neural networks with other AI models, and the development of more sophisticated fusion methods.
Potential Developments
Future developments may include the creation of more powerful and efficient neural network architectures, enhanced transfer learning techniques, and the integration of neural networks with other technologies like quantum computing.
Case Studies
Successful Examples in Industry
In healthcare, combining neural networks has led to significant improvements in disease diagnosis and treatment recommendations. For example, combining CNNs with RNNs has enhanced the accuracy of medical image analysis and patient monitoring.
Lessons Learned from Case Studies
Key lessons from successful case studies include the importance of data quality, the need for careful model tuning, and the benefits of leveraging diverse neural network architectures to address complex problems.
Online Course
I have come across many online courses, but finally found a truly great platform to save your time and money.
1.Prag Robotics_ TBridge
2.Coursera
Best Practices
Strategies for Effective Combination
Effective strategies for combining neural networks include using ensemble methods to enhance performance, leveraging transfer learning to save time and resources, and adopting a systematic approach to model integration.
Avoiding Common Pitfalls
Common pitfalls to avoid include overfitting, ignoring data quality, and underestimating the complexity of model integration. By being aware of these challenges, we can develop more robust and effective combined neural network models.
Conclusion
Combining two trained neural networks can significantly enhance their capabilities, leading to more accurate and versatile AI models. Whether through ensemble learning, transfer learning, or neural network fusion, the potential benefits are immense. By adopting the right strategies and tools, we can unlock new possibilities in AI and drive advancements across various fields.
FAQs
What is the easiest method to combine neural networks?
The easiest method is ensemble learning, where multiple models are combined to improve performance and accuracy.
Can different types of neural networks be combined?
Yes, different types of neural networks, such as CNNs and RNNs, can be combined to leverage their unique strengths.
What are the typical challenges in combining neural networks?
Challenges include technical integration, data quality, and avoiding overfitting. Careful planning and validation are essential.
How does combining neural networks enhance performance?
Combining neural networks enhances performance by leveraging diverse models, reducing errors, and improving generalization.
Is combining neural networks beneficial for small datasets?
Yes, combining neural networks can be beneficial for small datasets, especially when using techniques like transfer learning to leverage knowledge from larger datasets.
#artificialintelligence#coding#raspberrypi#iot#stem#programming#science#arduinoproject#engineer#electricalengineering#robotic#robotica#machinelearning#electrical#diy#arduinouno#education#manufacturing#stemeducation#robotics#robot#technology#engineering#robots#arduino#electronics#automation#tech#innovation#ai
4 notes
Text
This week's development meeting session had an interesting topic: "The Art of Giving Feedback/Feedforward." Before putting the material together, I imagined that a storytelling approach would be a good fit, thinking back on my own experiences of receiving feedback from several supervisors over the years. Some of it was quite painful; some of it fueled the drive to keep growing and learning.
I used to think feedback was a bad thing, something hurtful. It turns out my assumption was wrong: feedback only becomes that way when it is delivered poorly. In fact, feedback has since become a daily routine in the working world.
No experience is without value; good or bad, all of it carries meaning if you reflect on it. The unpleasant feedback I received became a lesson in how not to treat my own team when giving feedback, while the feedback that inspired me to grow and learn became the example to follow when giving feedback to the team.
Rasulullah is the finest example of how to give feedback. One day, Umar bin Khattab RA came to Rasulullah angry and wanting to punish a woman he believed had done wrong. Rasulullah gently listened to Umar's complaints without commenting directly. Instead, Rasulullah used the moment to give Umar feedback.
Wisely and with compassion, Rasulullah guided Umar to calm himself and understand the situation better. He advised Umar on the importance of understanding and weighing matters carefully before making an important decision. He did not criticize or blame outright, but with patience and empathy led Umar toward a better understanding of the situation and a wiser decision.
May this become a space for learning in the process ahead. Aamiin
Wednesday, 27 March 2024 / 16 Ramadhan 1445 H
5 notes
Text
Autism and The Predictive Brain: Absolute Thinking in a Relative World (Peter Vermeulen, 2022)
"Your brain is not a fan of (too many) surprises.
Instead, it prefers to deal as economically as possible with the energy management of the body for which it is responsible, which means not wasting any effort on information that is not necessary for our effective functioning and survival.
As a result, it blocks out anything that it doesn’t need. It only lets through what is essential.
In this way, for example, Japanese people cannot hear the difference between the letter R and the letter L, because that is not a useful distinction in their own language. (…)
In short, the whole idea of the computer metaphor – input - processing - output – simply does not hold water.
So how do things really work?
The basic idea is simple: the brain does not like surprises and therefore wants to anticipate what will happen to the maximum possible extent.
For this reason, the brain does not wait for the senses to provide information about the outside world, but prefers instead to make predictions about that world.
This is something that has only recently been discovered by brain scientists working at the start of the 21st century.
What makes this discovery so Copernican and therefore so revolutionary is the conclusion that perception does not begin as a result of a stimulus in the exterior world, but actually starts inside your head, in your own brain. (…)
When you see a piece of chocolate, you already know (unconsciously, without thinking) what it will taste like before you pick it up and pop it into your mouth.
The brain does not ask the senses for new information or input, but wants feedback on the information that it already has about the world.
With this in mind, your brain will check to see whether the texture and flavour of the chocolate matches your (and its) expectations.
In other words, what we used to call the feedforward of information (the bottom-up flow) is actually feedback, and vice versa!
The brain uses the senses to check the continuing usefulness and survival value of its own predictions about the world.
In neuroscience, this is known as the theory of the predictive brain or predictive coding."
12 notes
Text
I've got people at the industrial STEM research division I work at hyping up generative AI for all kinds of modeling and prediction stuff, and I'm just like.
You just described a clustering algorithm. JMP does that, we have the license.
You just described a discriminant algorithm. JMP does that, we have the license.
Ohh, you just described a sparse data discriminant algorithm. That's actually a machine learning problem, and by machine learning I mean a support vector machine, the least machine learniest ML algorithm. And JMP does that too, we have the license.
Now you just described a feedforward neural network for iterative sparse data prediction. That's definitely machine learning, but not any flavor of genAI and also. Yes. JMP does that, we have the license.
You just described a database search function. Those have been perfected for years, but somehow we can't fund IT well enough to get a working one. I cannot imagine why you think we need genAI to do this, or why we'd be able to get genAI to do this when an old reliable perfected algorithm is beyond us.
... and you are a manager gushing over a shitty faked image we could have done better and cheaper with a stock photo subscription or a concept artist on staff. Go away.
so like I said, I work in the tech industry, and it's been kind of fascinating watching whole new taboos develop at work around this genAI stuff. All we do is talk about genAI, everything is genAI now, "we have to win the AI race," blah blah blah, but nobody asks - you can't ask -
What's it for?
What's it for?
Why would anyone want this?
I sit in so many meetings and listen to genuinely very intelligent people talk until steam is rising off their skulls about genAI, and wonder how fast I'd get fired if I asked: do real people actually want this product, or are the only people excited about this technology the shareholders who want to see lines go up?
like you realize this is a bubble, right, guys? because nobody actually needs this? because it's not actually very good? normal people are excited by the novelty of it, and finance bro capitalists are wetting their shorts about it because they want to get rich quick off of the Next Big Thing In Tech, but the novelty will wear off and the bros will move on to something else and we'll just be left with billions and billions of dollars invested in technology that nobody wants.
and I don't say it, because I need my job. And I wonder how many other people sitting at the same table, in the same meeting, are also not saying it, because they need their jobs.
idk man it's just become a really weird environment.
33K notes
Text
Next-Gen Neural Networks: Revolutionising AI Architecture
Neural networks are the backbone of modern artificial intelligence, and they're undergoing a quiet revolution. From transformers and graph neural networks (GNNs) to spiking neural networks and neuromorphic computing, next-gen architectures are reshaping how AI learns, adapts, and interacts with the world. These innovations aren't just about boosting performance—they're unlocking new applications and making AI more efficient, scalable, and human-like.
Traditional feedforward and convolutional neural networks (CNNs) were game-changers, but they have limitations, especially when dealing with sequential data, reasoning tasks, or long-term dependencies. Enter transformers—pioneered by models like BERT and GPT—which have revolutionized natural language processing by enabling AI to grasp context at scale. Their self-attention mechanisms allow machines to understand nuance, tone, and relationships across large blocks of text or time series data.
Meanwhile, graph neural networks are enabling AI to better process structured, relational data such as social networks, molecular structures, or logistics networks. By modeling data as nodes and edges, GNNs provide deeper insight into how elements interact—something traditional neural networks struggle with. This advancement opens up new frontiers in drug discovery, fraud detection, and recommendation systems.
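Stripped of library conveniences, one message-passing step boils down to each node mixing its own features with an aggregate of its neighbours'. The sketch below is a bare-bones illustration of that idea (in practice you would reach for a dedicated library such as PyTorch Geometric rather than hand-rolling this):

```python
import torch

def message_passing_step(node_feats, adjacency, W_self, W_neigh):
    """One GNN layer. node_feats: (N, F); adjacency: (N, N) 0/1 matrix; W_*: (F, F_out)."""
    deg = adjacency.sum(dim=1, keepdim=True).clamp(min=1)  # node degrees (avoid divide-by-zero)
    neigh_mean = (adjacency @ node_feats) / deg            # average the neighbours' features
    return torch.relu(node_feats @ W_self + neigh_mean @ W_neigh)
```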
Another major leap is occurring with neuromorphic computing and spiking neural networks, which mimic the way the human brain processes information. These architectures are energy-efficient, event-driven, and well-suited for edge AI applications like robotics and IoT. They promise not only faster and more power-conscious AI but also systems that respond more naturally to stimuli.
The evolution of neural networks is also making AI more accessible and deployable. With innovations like model compression, few-shot learning, and modular architectures, it's now possible to run powerful AI models on smartphones, embedded systems, and even wearables. This shift is democratizing AI and embedding intelligence directly into the world around us.
To take full advantage of next-gen neural networks, organizations often need expert guidance to navigate the complexity. Professional AI and ML development services can help businesses implement cutting-edge architectures, optimize training processes, and tailor solutions to specific use cases. These partners bring the technical depth required to turn emerging research into real-world applications.
With rapid advancement comes responsibility. As neural networks become more powerful, ensuring transparency, fairness, and interpretability remains a priority. Researchers and developers must design architectures that not only deliver results but also align with ethical principles and societal needs.
The future of AI is being shaped at the architectural level. By embracing next-gen neural networks, we're not just improving models—we're redefining what AI is capable of. From smarter cities to more intuitive assistants and life-saving diagnostics, these technologies are laying the groundwork for the next era of intelligent systems.
#NeuralNetworks #Transformers #GraphNeural
0 notes
Text
A Technical and Business Perspective for Choosing the Right LLM for Enterprise Applications.
In 2025, Large Language Models (LLMs) have emerged as pivotal assets for enterprise digital transformation, powering over 65% of AI-driven initiatives. From automating customer support to enhancing content generation and decision-making processes, LLMs have become indispensable. Yet, despite the widespread adoption, approximately 46% of AI proofs-of-concept were abandoned and 42% of enterprise AI projects were discontinued, mainly due to challenges around cost, data privacy, and security. A recurring pattern identified is the lack of due diligence in selecting the right LLM tailored to specific enterprise needs. Many organizations adopt popular models without evaluating critical factors such as model architecture, operational feasibility, data protection, and long-term costs. Enterprises that invested time in technically aligning LLMs with their business workflows, however, have reported significant outcomes — including a 40% drop in operational costs and up to a 35% boost in process efficiency.
LLMs are rooted in the Transformer architecture, which revolutionized NLP through self-attention mechanisms and parallel processing capabilities. Components such as Multi-Head Self-Attention (MHSA), Feedforward Neural Networks (FFNs), and advanced positional encoding methods (like RoPE and Alibi) are essential to how LLMs understand and generate language. In 2025, newer innovations such as FlashAttention-2 and Sparse Attention have improved speed and memory efficiency, while the adoption of Mixture of Experts (MoE) and Conditional Computation has optimized performance for complex tasks. Tokenization techniques like BPE, Unigram LM, and DeepSeek Adaptive Tokenization help break down language into machine-understandable tokens. Training strategies have also evolved. While unsupervised pretraining using Causal Language Modeling and Masked Language Modeling remains fundamental, newer approaches like Progressive Layer Training and Synthetic Data Augmentation are gaining momentum. Fine-tuning has become more cost-efficient with Parameter-Efficient Fine-Tuning (PEFT) methods such as LoRA, QLoRA, and Prefix-Tuning. Additionally, Reinforcement Learning with Human Feedback (RLHF) is now complemented by Direct Preference Optimization (DPO) and Contrastive RLHF to better align model behavior with human intent.
From a deployment perspective, efficient inference is crucial. Enterprises are adopting quantization techniques like GPTQ and SmoothQuant, as well as memory-saving architectures like xFormers, to manage computational loads at scale. Sparse computation and Gated Experts further enhance processing by activating only the most relevant neural pathways. Retrieval-Augmented Generation (RAG) has enabled LLMs to respond in real-time with context-aware insights by integrating external knowledge sources. Meanwhile, the industry focus on data security and privacy has intensified. Technologies like Federated Learning, Differential Privacy, and Secure Multi-Party Computation (SMPC) are becoming essential for protecting sensitive information. Enterprises are increasingly weighing the pros and cons of cloud-based vs. on-prem LLMs. While cloud LLMs like GPT-5 and Gemini Ultra 2 offer scalability and multimodal capabilities, they pose higher privacy risks. On-prem models like Llama 3, Falcon 2, and DeepSeek ensure greater data sovereignty, making them ideal for sensitive and regulated sectors.
Comparative evaluations show that different LLMs shine in different use cases. GPT-5 excels in customer service and complex document processing, while Claude 3 offers superior ethics and privacy alignment. DeepSeek and Llama 3 are well-suited for multilingual tasks and on-premise deployment, respectively. Models like Custom ai Gemini Ultra 2 and DeepSeek-Vision demonstrate impressive multimodal capabilities, making them suitable for industries needing text, image, and video processing. With careful evaluation of technical and operational parameters — such as accuracy, inference cost, deployment strategy, scalability, and compliance — enterprises can strategically choose the right LLM that fits their business needs. A one-size-fits-all approach does not work in LLM adoption. Organizations must align model capabilities with their core objectives and regulatory requirements to fully unlock the transformative power of LLMs in 2025 and beyond.
0 notes
Text
CSE131 - Homework #12 Solved
Objective: Learn how to use file input and output, strings, and functions. 12. Convolutional Neural Networks (CNNs) are a type of feedforward neural network that excels in machine learning tasks involving large images. A critical component of CNNs is the 2-D Convolution layer, which uses the weights in the kernel to perform convolution operations with the data input from the previous layer, and…
0 notes
Text
Interesting Papers for Week 19, 2025
Individual-specific strategies inform category learning. Collina, J. S., Erdil, G., Xia, M., Angeloni, C. F., Wood, K. C., Sheth, J., Kording, K. P., Cohen, Y. E., & Geffen, M. N. (2025). Scientific Reports, 15, 2984.
Visual activity enhances neuronal excitability in thalamic relay neurons. Duménieu, M., Fronzaroli-Molinieres, L., Naudin, L., Iborra-Bonnaure, C., Wakade, A., Zanin, E., Aziz, A., Ankri, N., Incontro, S., Denis, D., Marquèze-Pouey, B., Brette, R., Debanne, D., & Russier, M. (2025). Science Advances, 11(4).
The functional role of oscillatory dynamics in neocortical circuits: A computational perspective. Effenberger, F., Carvalho, P., Dubinin, I., & Singer, W. (2025). Proceedings of the National Academy of Sciences, 122(4), e2412830122.
Expert navigators deploy rational complexity–based decision precaching for large-scale real-world planning. Fernandez Velasco, P., Griesbauer, E.-M., Brunec, I. K., Morley, J., Manley, E., McNamee, D. C., & Spiers, H. J. (2025). Proceedings of the National Academy of Sciences, 122(4), e2407814122.
Basal ganglia components have distinct computational roles in decision-making dynamics under conflict and uncertainty. Ging-Jehli, N. R., Cavanagh, J. F., Ahn, M., Segar, D. J., Asaad, W. F., & Frank, M. J. (2025). PLOS Biology, 23(1), e3002978.
Hippocampal Lesions in Male Rats Produce Retrograde Memory Loss for Over‐Trained Spatial Memory but Do Not Impact Appetitive‐Contextual Memory: Implications for Theories of Memory Organization in the Mammalian Brain. Hong, N. S., Lee, J. Q., Bonifacio, C. J. T., Gibb, M. J., Kent, M., Nixon, A., Panjwani, M., Robinson, D., Rusnak, V., Trudel, T., Vos, J., & McDonald, R. J. (2025). Journal of Neuroscience Research, 103(1).
Sensory experience controls dendritic structure and behavior by distinct pathways involving degenerins. Inberg, S., Iosilevskii, Y., Calatayud-Sanchez, A., Setty, H., Oren-Suissa, M., Krieg, M., & Podbilewicz, B. (2025). eLife, 14, e83973.
Distributed representations of temporally accumulated reward prediction errors in the mouse cortex. Makino, H., & Suhaimi, A. (2025). Science Advances, 11(4).
Adaptation optimizes sensory encoding for future stimuli. Mao, J., Rothkopf, C. A., & Stocker, A. A. (2025). PLOS Computational Biology, 21(1), e1012746.
Memory load influences our preparedness to act on visual representations in working memory without affecting their accessibility. Nasrawi, R., Mautner-Rohde, M., & van Ede, F. (2025). Progress in Neurobiology, 245, 102717.
Layer-specific control of inhibition by NDNF interneurons. Naumann, L. B., Hertäg, L., Müller, J., Letzkus, J. J., & Sprekeler, H. (2025). Proceedings of the National Academy of Sciences, 122(4), e2408966122.
Multisensory integration operates on correlated input from unimodal transient channels. Parise, C. V, & Ernst, M. O. (2025). eLife, 12, e90841.3.
Random noise promotes slow heterogeneous synaptic dynamics important for robust working memory computation. Rungratsameetaweemana, N., Kim, R., Chotibut, T., & Sejnowski, T. J. (2025). Proceedings of the National Academy of Sciences, 122(3), e2316745122.
Discriminating neural ensemble patterns through dendritic computations in randomly connected feedforward networks. Somashekar, B. P., & Bhalla, U. S. (2025). eLife, 13, e100664.4.
Effects of noise and metabolic cost on cortical task representations. Stroud, J. P., Wojcik, M., Jensen, K. T., Kusunoki, M., Kadohisa, M., Buckley, M. J., Duncan, J., Stokes, M. G., & Lengyel, M. (2025). eLife, 13, e94961.2.
Representational geometry explains puzzling error distributions in behavioral tasks. Wei, X.-X., & Woodford, M. (2025). Proceedings of the National Academy of Sciences, 122(4), e2407540122.
Deficiency of orexin receptor type 1 in dopaminergic neurons increases novelty-induced locomotion and exploration. Xiao, X., Yeghiazaryan, G., Eggersmann, F., Cremer, A. L., Backes, H., Kloppenburg, P., & Hausen, A. C. (2025). eLife, 12, e91716.4.
Endopiriform neurons projecting to ventral CA1 are a critical node for recognition memory. Yamawaki, N., Login, H., Feld-Jakobsen, S. Ø., Molnar, B. M., Kirkegaard, M. Z., Moltesen, M., Okrasa, A., Radulovic, J., & Tanimura, A. (2025). eLife, 13, e99642.4.
Cost-benefit tradeoff mediates the transition from rule-based to memory-based processing during practice. Yang, G., & Jiang, J. (2025). PLOS Biology, 23(1), e3002987.
Identification of the subventricular tegmental nucleus as brainstem reward center. Zichó, K., Balog, B. Z., Sebestény, R. Z., Brunner, J., Takács, V., Barth, A. M., Seng, C., Orosz, Á., Aliczki, M., Sebők, H., Mikics, E., Földy, C., Szabadics, J., & Nyiri, G. (2025). Science, 387(6732).
#neuroscience#science#research#brain science#scientific publications#cognitive science#neurobiology#cognition#psychophysics#neurons#neural computation#neural networks#computational neuroscience
10 notes
Text
From Feedback to Feedforward: Using AI-Powered Assessment Flywheel to Drive Student Competency
See on Scoop.it - Education 2.0 & 3.0
Discover how Generative AI can enhance student competency through the assessment flywheel model, providing personalized feedforward comments and automating low-stakes formative assessments for deeper learning.
0 notes
Text
LLM Development: How to Build a Powerful Large Language Model from Scratch
Large Language Models (LLMs) have revolutionized the field of artificial intelligence, enabling sophisticated applications in natural language processing (NLP), chatbots, content generation, and more. These models, such as OpenAI's GPT series and Google's PaLM, leverage billions of parameters to process and generate human-like text. However, developing an LLM from scratch is a challenging endeavor requiring deep technical expertise, massive computational resources, and a robust dataset.
In this guide, we will explore the step-by-step process of building a powerful LLM from scratch, covering everything from the fundamental concepts to deployment and scaling. Whether you're a researcher, AI enthusiast, or an industry expert looking to understand LLM development, this article will provide in-depth insights into the entire lifecycle of an LLM.
Understanding the Fundamentals of LLMs
Before diving into the development process, it is crucial to understand what makes an LLM powerful and how it differs from traditional NLP models.
What Makes a Model "Large"?
LLMs are characterized by their vast number of parameters, which define the complexity and depth of the neural network. Some of the key factors that contribute to an LLM’s capabilities include:
Number of Parameters: Models like GPT-4 have hundreds of billions of parameters, making them highly sophisticated in generating contextually relevant text.
Training Data: The quality and diversity of the training dataset play a significant role in the model's accuracy and generalizability.
Computational Power: Training LLMs requires high-performance GPUs or TPUs, as well as distributed computing resources.
Scalability: Large models require distributed architectures to efficiently process and train vast datasets.
Key Architectures in LLMs
At the heart of LLMs lies the Transformer architecture, which revolutionized NLP by introducing self-attention mechanisms. The key components include:
Self-Attention Mechanism: Allows the model to focus on relevant words within a sentence, improving coherence.
Token Embeddings: Converts words into numerical representations for processing.
Positional Encoding: Retains the sequence order of words in a sentence.
Feedforward Layers: Responsible for processing the attention-weighted input and making predictions.
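To make the self-attention component concrete, here is a hedged, single-head sketch of scaled dot-product attention; real implementations add multiple heads, masking, and learned projection matrices.

```python
import math
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(Q, K, V):
    """Q, K, V: (batch, seq_len, d_k) tensors produced by learned projections."""
    d_k = Q.size(-1)
    scores = Q @ K.transpose(-2, -1) / math.sqrt(d_k)  # how strongly each token attends to every other
    weights = F.softmax(scores, dim=-1)                # attention weights sum to 1 for each query token
    return weights @ V                                 # attention-weighted combination of value vectors
```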
Setting Up the Development Environment
Developing an LLM requires a robust setup, including hardware, software, and infrastructure considerations.
Hardware Requirements
High-Performance GPUs/TPUs: LLMs require extensive parallel processing. NVIDIA A100, H100, or Google's TPUs are commonly used.
Cloud-Based Solutions: Services like AWS, Google Cloud, and Microsoft Azure provide scalable infrastructure for LLM training.
Storage Considerations: Training data and model checkpoints require large storage capacities, often measured in terabytes.
Essential Software Frameworks
PyTorch: A popular deep learning framework used for building LLMs.
TensorFlow: Offers high scalability for training deep learning models.
JAX: Optimized for high-performance computing and auto-differentiation.
DeepSpeed & FSDP: Libraries that optimize training efficiency by enabling memory-efficient model parallelism.
Choosing the Right Dataset
Common Crawl: A vast repository of web pages useful for language modeling.
Wikipedia & BooksCorpus: Ideal for training general-purpose NLP models.
Domain-Specific Data: Tailored datasets for specialized applications (e.g., medical or financial text).
Synthetic Data Generation: Using smaller models to create high-quality synthetic text data.
Data Collection and Preprocessing
Sourcing High-Quality Data
A well-trained LLM relies on diverse and high-quality datasets. It is important to balance publicly available data with domain-specific text for improved performance.
Data Cleaning and Tokenization
Removing Duplicates and Noise: Ensuring only high-quality text is used.
Tokenization: Splitting text into smaller components (subwords, words, or characters) to enhance model efficiency.
Handling Bias: Implementing techniques to reduce biases in training data and ensure ethical AI development.
Normalization: Converting text into a standardized format to avoid inconsistencies.
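As a small, hedged example of what tokenization looks like in practice (assuming the Hugging Face `transformers` library is installed; GPT-2's BPE tokenizer is used purely as an example):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # a byte-pair-encoding (BPE) tokenizer

text = "Tokenization splits text into subword units."
print(tokenizer.tokenize(text))  # subword pieces; exact splits depend on the learned vocabulary
print(tokenizer.encode(text))    # the integer token IDs a model actually consumes
```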
Model Architecture and Training
Designing the Neural Network
Building an LLM involves stacking multiple Transformer layers. Each layer processes input data through self-attention and feedforward networks.
Training Strategies
Supervised Learning: Training on labeled data with specific input-output pairs.
Unsupervised Learning: Exposing the model to large-scale text without predefined labels.
Self-Supervised Learning: Using the model’s own predictions as pseudo-labels to improve learning.
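The self-supervised objective most LLMs are pretrained with is causal language modeling: predict each next token, so the labels are simply the input shifted by one position. A hedged sketch of that loss:

```python
import torch.nn.functional as F

def causal_lm_loss(logits, input_ids):
    """logits: (batch, seq_len, vocab_size); input_ids: (batch, seq_len)."""
    shift_logits = logits[:, :-1, :]   # predictions at positions 0..T-2
    shift_labels = input_ids[:, 1:]    # targets are the *next* tokens at positions 1..T-1
    return F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_labels.reshape(-1),
    )
```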
Fine-Tuning and Transfer Learning
Pretraining: Training a base model on vast text corpora.
Fine-Tuning: Adapting the model to specific tasks (e.g., chatbot applications or medical text analysis).
Adapter Layers: Using modular layers to efficiently fine-tune large-scale models.
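For the adapter-layer idea specifically, here is a minimal, hedged sketch in plain PyTorch: a small bottleneck module with a residual connection that can be inserted into a frozen backbone, so only a handful of parameters are trained during fine-tuning.

```python
import torch.nn as nn

class Adapter(nn.Module):
    def __init__(self, dim, bottleneck=32):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)  # project down to a small bottleneck
        self.up = nn.Linear(bottleneck, dim)    # project back to the original width
        self.act = nn.ReLU()

    def forward(self, x):
        # residual connection keeps the adapter a small, trainable perturbation
        # on top of the frozen backbone's representation
        return x + self.up(self.act(self.down(x)))
```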
Optimizing Performance and Efficiency
Training LLMs is computationally expensive, making optimization essential.
Reducing Computational Costs
Quantization: Compressing the model while maintaining performance.
Distillation: Training smaller models using the knowledge of larger models.
Sparse Activation: Activating only relevant parts of the model to optimize computation.
Distributed Training
Data Parallelism: Splitting data across multiple GPUs/TPUs.
Model Parallelism: Splitting the model itself across different processing units.
Pipeline Parallelism: Dividing layers across multiple devices to maximize efficiency.
Hyperparameter Tuning
Learning Rate Schedules: Adjusting the learning rate dynamically for optimal convergence.
Batch Size Optimization: Balancing memory usage and training stability.
Gradient Accumulation: Reducing memory load by updating gradients less frequently.
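Gradient accumulation in particular is easy to show in a few lines. A hedged sketch with a stand-in model and data (a real setup would use a proper DataLoader):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
dataloader = [(torch.randn(4, 10), torch.randn(4, 1)) for _ in range(32)]  # stand-in for a DataLoader

accum_steps = 8
optimizer.zero_grad()
for step, (x, y) in enumerate(dataloader):
    loss = loss_fn(model(x), y) / accum_steps  # scale so accumulated gradients match one large batch
    loss.backward()                            # gradients add up across iterations
    if (step + 1) % accum_steps == 0:
        optimizer.step()                       # one weight update per `accum_steps` mini-batches
        optimizer.zero_grad()
```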
Deployment and Scaling
Hosting Options
On-Premise Deployment: Offers complete control but requires substantial infrastructure.
Cloud-Based Deployment: Scalable and accessible via APIs (e.g., OpenAI API, Hugging Face Inference).
API Integration
RESTful APIs: Allow seamless integration into applications.
Inference Optimization: Techniques like caching and batch processing improve response times.
Edge Deployment: Running models on edge devices for faster inference.
Security and Privacy Considerations
Data Anonymization: Protecting user information in training data.
Access Control Mechanisms: Preventing unauthorized access to APIs and model endpoints.
Federated Learning: Allowing decentralized training while preserving user privacy.
Conclusion
Building a powerful LLM from scratch is a complex yet rewarding challenge that requires expertise in deep learning, data engineering, and computational optimization. While large-scale organizations invest heavily in developing proprietary models, advancements in open-source frameworks and cloud-based AI solutions have made LLM development more accessible.
For aspiring AI developers, starting with smaller-scale models and leveraging pre-trained LLMs can be a practical approach before venturing into full-scale development. By understanding the key aspects covered in this guide, you can embark on the journey of creating your own LLM and contributing to the ever-evolving field of AI-driven language understanding. As AI technology continues to advance, the potential applications of LLMs will only expand, making it an exciting and vital area of research and development.
#ai#blockchain#crypto#cryptocurrency#dex#ai generated#blockchain app factory#ico#blockchainappfactory
0 notes