This is hands down one of the BEST visualizations of how LLMs actually work. | Andreas Horn
Let's break it down:
Tokenization & Embeddings:
- Input text is broken into tokens (smaller chunks).
- Each token is mapped to a vector in high-dimensional space, where words with similar meanings cluster together.
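The two steps above can be sketched in a few lines of toy Python. Whitespace tokenization and a random embedding table stand in for a real subword tokenizer and learned weights; all names and sizes here are illustrative:

```python
# Toy sketch: tokenize, then look each token up in an embedding table.
import numpy as np

text = "the cat sat on the mat"
tokens = text.split()                      # toy whitespace tokenization (real LLMs use BPE)
vocab = {tok: i for i, tok in enumerate(sorted(set(tokens)))}

rng = np.random.default_rng(0)
d_model = 8                                # hypothetical embedding size
embedding_table = rng.normal(size=(len(vocab), d_model))  # would be learned in practice

token_ids = [vocab[t] for t in tokens]     # map each token to its id
embeddings = embedding_table[token_ids]    # one vector per token
print(embeddings.shape)                    # (6, 8): 6 tokens, 8 dimensions each
```

Note that both occurrences of "the" map to the same vector here; it is the attention step below that lets identical tokens take on context-dependent meanings.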
The Attention Mechanism (Self-Attention):
- Words influence each other based on context, ensuring "bank" in "riverbank" isn't confused with a financial bank.
- The Attention Block weighs relationships between words, refining their representations dynamically.
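A toy single-head version of scaled dot-product self-attention shows how this weighing works. The projection matrices would normally be learned; here they are random, and all sizes are made up:

```python
# Minimal single-head scaled dot-product self-attention sketch.
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])          # how strongly each token attends to the others
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax: rows sum to 1
    return weights @ V                               # context-refined token vectors

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                          # 4 tokens, 8-dim embeddings
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (4, 8): same shape, but each vector now mixes in context
```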
Feed-Forward Layers (Deep Neural Network Processing):
- After attention, tokens pass through multiple feed-forward layers that refine meaning.
- Each layer learns deeper semantic relationships, improving predictions.
Iteration & Deep Learning:
- This process repeats through dozens or even hundreds of layers, adjusting token meanings iteratively.
- This is where the "deep" in deep learning comes in: layers upon layers of matrix multiplications and optimizations.
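The layer-stacking idea in miniature: a stand-in block repeated N times with a residual connection, showing how token vectors keep their shape while being refined at every layer. This is a schematic, not a real Transformer block:

```python
# Schematic of depth: repeat a block many times, each with a residual connection.
import numpy as np

rng = np.random.default_rng(0)
n_layers, d_model = 12, 8                  # toy sizes; real LLMs stack dozens or hundreds of layers

x = rng.normal(size=(4, d_model))          # 4 token vectors
for _ in range(n_layers):
    W = rng.normal(size=(d_model, d_model)) * 0.1
    x = x + np.tanh(x @ W)                 # stand-in for one attention + feed-forward block
print(x.shape)                             # (4, 8): shape is unchanged, meaning is refined
```

The residual (`x + ...`) is what lets information flow cleanly through very deep stacks.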
Prediction & Sampling:
- The final vector representation is used to predict the next word as a probability distribution.
- The model samples from this distribution, generating text word by word.
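This last step, from a final vector to a probability distribution to a sampled token, in toy form. The vocabulary, weights, and sizes are invented for illustration:

```python
# Sketch: project the final hidden vector to vocabulary logits, softmax, then sample.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "cat", "sat", "mat", "dog"]    # toy vocabulary
final_hidden = rng.normal(size=8)              # last token's final-layer vector
W_unembed = rng.normal(size=(8, len(vocab)))   # maps hidden state to one logit per word

logits = final_hidden @ W_unembed
probs = np.exp(logits - logits.max())
probs /= probs.sum()                           # softmax -> valid probability distribution
next_token = rng.choice(vocab, p=probs)        # sample the next word from it
print(next_token)
```

Sampling (rather than always taking the most likely word) is what makes outputs varied; temperature and top-p tune how adventurous that sampling is.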
These mechanics are at the core of all LLMs (e.g. ChatGPT). It is crucial to have a solid understanding of how these mechanics work if you want to build scalable, responsible AI solutions.
Here is the full video from 3Blue1Brown with the explanation. I highly recommend reading, watching, and bookmarking this for a further deep dive: https://lnkd.in/dAviqK_6
I explore these developments, and what they mean for real-world use cases, in my weekly newsletter. You can subscribe here for free: https://lnkd.in/dbf74Y9E