Aray Karjauv

Posts

Mar 31, 2023
Hands-on Guide to Multi-Language Speech Recognition and Speaker diarization
Multi-language speech recognition and speaker diarization are two important tasks in audio processing. Speech recognition converts spoken language into written text, while speaker diarization segments an audio recording and assigns each segment to a particular speaker. These techniques are used in a variety of applications, including podcast and conference transcription.

In this blog post, you will learn how to build a pipeline for multi-language speech recognition and speaker diarization using existing libraries.

Jan 6, 2023
Turning StyleGAN into a latent feature extractor
While Generative Adversarial Networks (GANs) are primarily known for their ability to generate high-quality synthetic images, their main task is to learn a latent feature representation of real data. In addition, recent improvements to the original GAN allow it to learn a disentangled latent representation, enabling us to obtain semantically meaningful embeddings.

This property could allow GANs to be used as high-level feature extractors. The problem, however, is that the original GAN architecture is not invertible; in other words, it offers no direct way to project real images into the latent space.

This article addresses this issue and attempts to answer whether GANs can extract meaningful features from real images and whether those features are suitable for downstream tasks.

Jul 12, 2022
Bringing Python to the Web
Have you ever wanted to share your cool Python app with the world without deploying an entire Django server or developing a mobile app just for a small project?

Good news, you don’t have to! All you need is to add one JavaScript library to your HTML page and it will even work on mobile devices, allowing you to mix JS with Python so you can take advantage of both worlds.
