### PyTorch

### Multi-Head Attention

The attention mechanism now appears in a wide variety of architectures and tasks (translation, text generation, image captioning). Attention analyzes and highlights the relationships between the elements of the input and output sequences. Known as the generalized attention mechanism, it was originally proposed for machine translation models built on recurrent networks, where it solved the problem of long-range memory in the translation of long sentences. This approach outperformed the LSTM-based recurrent networks we reviewed earlier.
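As a minimal illustration of the idea (a plain NumPy sketch of scaled dot-product attention, not the seminar's PyTorch code), each query vector is compared against all keys, the similarities are turned into weights with a softmax, and the output is the weighted mix of the value vectors:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                   # query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax: each row sums to 1
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))   # 3 query vectors of dimension d_k = 4
K = rng.normal(size=(5, 4))   # 5 key vectors
V = rng.normal(size=(5, 4))   # 5 value vectors
out, w = scaled_dot_product_attention(Q, K, V)
print(out.shape)   # (3, 4): one mixed value vector per query
```

A multi-head layer simply runs several such attentions in parallel on learned projections of Q, K, and V and concatenates the results.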

### How to Teach the Computer to Think. Lesson 5

Let's continue our seminar series on machine learning. Today's lecture is about dynamics. We will also touch upon the inference algorithm and take a closer look at the Merge module. We are going to build several models of the world, from trivial to quite peculiar, and see how original a solution to a problem can be.

### How to Teach the Computer to Think. Lesson 4

As in previous seminars, today we are discussing the problem of building an intelligent system capable of understanding information about a limited world. The system will now also feature basic actions.

Let's move away from the box world for a while. Our AI gradually comes to understand the relations between actions (to approach, to take something, to put it somewhere), and in this space of actions and relations, we continue teaching it to build adequate models which will be plausible in our limited world.

### How to Teach the Computer to Think. Lesson 3

Continuing our seminars on machine learning, we will first revise the previous lesson and then move on to the theoretical part. Today's objective is to examine the relations that describe conditionally closed cubes in the ordinary world, to discuss the concepts of the closed and open world, and to outline ways to move beyond this rather limited model toward more realistic tasks.

### How to Teach the Computer to Think. Lesson 2

In the second video of our series of seminars on teaching the computer to think, we are discussing the basic static aspects of the logical approach. We will also focus on topological aspects, leaving dimensions and distances for later.

The seminar consists of two parts. First we recall mathematical logic, and then we analyze in detail the axiomatics of the world of cubes, which serves as our basic model for mathematical inference and the construction of axioms.

### How to Teach the Computer to Think. Lesson 1

In this seminar course, we are going to consider a rather specific task: building an intelligent system able to understand information about a limited world. By "limited" we only mean that the computer is not expected to comprehend everything from stock reports to baseball rules straight away. It can be quite an ordinary world, such as our house, its neighborhood, or the world of simple human interactions.

### Word2Vec Semantics and Technology

Word2Vec, developed in 2013, is a method for efficiently creating word embeddings. Beyond word embeddings, some of its concepts have also proved effective in recommendation engines and data interpretation, even for commercial, non-language tasks. Many modern NLP applications build on ideas from Word2Vec. Today we look into the Word2Vec technology along with methods for knowledge representation in intelligent systems.
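To give a flavor of how training data is prepared (a toy sketch, not the seminar's implementation), the skip-gram variant of Word2Vec slides a window over the text and emits (center, context) word pairs; a shallow network trained to predict context from center then yields the embeddings as its learned weights:

```python
def skipgram_pairs(tokens, window=2):
    """Generate (center, context) training pairs as in skip-gram Word2Vec."""
    pairs = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:                      # skip the center word itself
                pairs.append((center, tokens[j]))
    return pairs

sentence = "the cat sat on the mat".split()
pairs = skipgram_pairs(sentence, window=1)
print(pairs[:4])   # [('the', 'cat'), ('cat', 'the'), ('cat', 'sat'), ('sat', 'cat')]
```

In practice one would use a library such as gensim rather than hand-rolling the training loop, but the pair-generation step above is the conceptual core of the skip-gram objective.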

### Keras. Convolutional Neural Network

A convolutional neural network (CNN or ConvNet) is one of the most widely used deep learning architectures. It is a class of model that learns to classify objects directly from images, video, audio, or text.

Today we will find out which features make convolutional neural networks so useful. In the practical part, we will train a network to recognize geometric relations and test it.
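The operation that gives the network its name can be sketched in a few lines (plain NumPy here; in Keras this is what a `Conv2D` layer computes, with learned kernels). A small kernel slides over the image, and each output pixel is the weighted sum of the patch under the kernel, so one kernel acts as one feature detector:

```python
import numpy as np

def conv2d(image, kernel):
    """'Valid' 2-D cross-correlation: the core operation of a CNN layer."""
    h, w = kernel.shape
    H, W = image.shape
    out = np.zeros((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + h, j:j + w] * kernel)
    return out

# Image with a vertical edge between columns 1 and 2
image = np.array([[0., 0., 1., 1.],
                  [0., 0., 1., 1.],
                  [0., 0., 1., 1.],
                  [0., 0., 1., 1.]])
edge = np.array([[-1., 1.]])        # responds where brightness jumps left-to-right
response = conv2d(image, edge)
print(response)                      # nonzero only along the edge column
```

A CNN stacks many such learned kernels with nonlinearities and pooling, so deeper layers respond to increasingly complex patterns.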

### Support Vector Machine (SVM)

Today's seminar is dedicated to one of the most popular supervised learning methods, used to solve both classification and regression problems. The algorithm, also known as the maximum-margin classifier, is widely applied to both linear and nonlinear problems.

Its main idea is to construct a hyperplane that optimally separates the sampled objects.