
George Hotz | Programming | ChatLLaMA: get in losers we’re building a chatbot | $1499 comma three








Date of stream 09 Mar 2023.
From $1499, buy at comma.ai/shop/comma-three
Live-stream chat added as Subtitles/CC – English (Twitch Chat) – three-dot menu icon – Show transcript

Source files: github.com/geohot/tinygrad/blob/llama/examples/llama.py | github.com/geohot/tinygrad/tree/llama



Follow for notifications: instagram.com/georgehotz | twitch.tv/georgehotz | twitter.com/comma_ai

Support George: twitch.tv/subs/georgehotz

Programming playlist: youtube.com/playlist?list=PLzFUMGbVxlQs5s-LNAyKgcq5SL28ZLLKC

Chapters:
00:00:00 muted intro
00:00:50 missing the rant about bad mood
00:01:35 not family friendly content
00:01:55 hardware to run tinygrad
00:02:10 don’t mess with george
00:02:35 what we did yesterday
00:05:00 where are we wasting time, not limited by gpu
00:06:20 python snakeprof
00:10:30 tinygrad discord llama channel, hacker news blow up
00:10:45 llama weights on torrent
00:11:20 does it run on nvidia 3090, ram
00:13:30 open assistant product, prompt engineering
00:21:10 no limits, don’t believe in ai safety
00:21:40 no guns, ai stupid
00:23:55 what you need to torrent for weights
00:27:25 torch.triu
00:32:25 nice pytorch error messages
00:36:45 russian mode
00:38:00 python bottleneck
00:41:50 twitch chat not paying attention
00:42:50 pull request to facebook
00:43:50 hacker news, gotham chess plays chatgpt
00:45:30 chatbot loop
00:47:00 asking questions, bad enter
00:50:20 no ai filter
00:51:40 interesting output
00:53:50 sentencepieceprocessor decode_piece
00:58:00 removing extra enter
01:00:50 decode token to token, this is the problem
01:02:50 we learned how to take it out of russian mode
01:03:50 need demo answers, it’s called llama chat
01:06:30 be verbose in your answers
01:08:00 call back humor answer
01:08:50 pastebin improved pre prompts
01:17:00 not Gale, your name is Brian, cookies
01:18:00 what is your prompt
01:18:35 arbitrary python
01:22:45 impressive large language models
01:30:50 running on M1 Max
01:36:30 llama link
01:37:30 Brian running wild
01:39:00 real talk about ai safety
01:40:20 chat llama open source, running in web browser
01:42:20 funny response
01:43:45 connecting to the internet
01:45:00 model too small, too many tokens
01:47:00 darth vader, used car salesman name
01:50:30 Lana_Lux raiding stream, explaining to new people
01:51:15 what makes chatgpt good
01:52:00 fun answer
01:53:00 Gary only has 7B weights
01:54:10 chatml, need smarter model
02:00:00 what should we do?
02:01:00 bad rapper
02:03:20 stacy
02:07:15 limits of 7B weights
02:08:50 good answer
02:09:25 assuming gender
02:11:25 searching snapchat mentee.dreams
02:12:12 Replika terrible, asking more questions, rap
02:17:10 13B weights smart stacy, vram requirements
02:19:30 running on cpu, took a day to write this in tinygrad
02:21:55 comma build the best self driving car, tinygrad taking over the world
02:22:40 removing code, repetition penalty
02:30:10 testing stacy
02:36:10 sp_model.decode_pieces, bos_id
02:43:20 fun answer, need to make large work
02:50:10 reading the LLaMA paper
02:52:10 anyone running this on a mac? code pushed
02:53:45 macbook air m2
02:59:15 pytorch split file
03:01:30 trying to load the bigger model
03:13:00 python readinto
03:16:20 load only 13GB because of gc
03:18:20 making copy = bad
03:20:00 apple m1, m2 max, ultra memory
03:21:30 v.shape, tensors
03:23:00 real offer sponsor, tinycorp honest corp
03:32:10 millionaire
03:34:50 bad chatter, DEBUG=3, pointer, constant folded
03:38:40 the problem and easy fix, mailing the cookie for anyone who knows, thai food
03:44:30 what type is the output buffer
03:47:45 does not use ram because of fake tensors
03:49:55 mid model, loading 13B weight model
03:55:30 bad chat WEIGHTS=0
03:59:30 closing apps, chrome, discord
04:01:50 mac process disable swap, metal buffer force ram resident
04:05:00 force os x buffer to not swap, metal mark buffer as not swap, MTLResourceStorageModeShared
04:09:55 mac os x disable swap for single process, apple gpu supported types, pci-e 4 16x bandwidth
04:16:20 improving prompts, strassen algorithm
04:23:00 let’s go
04:27:25 getting food
04:35:00 marc andreessen
04:43:00 who is stacy
04:48:40 git commit add gary
04:49:00 context infinite
04:49:20 gary vs stacy
04:49:40 ai george hotz disappointment
04:50:20 stacy saying thanks for watching
04:51:00 llama branch 24GB ram min requirement, having fun with stacy this weekend

Official George Hotz communication channels:







We archive George Hotz and comma.ai videos for fun.

Thank you for reading and using the SHOW MORE button.
We hope you enjoy watching George’s videos as much as we do.
See you at the next video.

Video Link






24 Comments

  1. From $1499 buy comma.ai/shop/comma-three | github.com/geohot/tinygrad/blob/llama/examples/llama.py | https://github.com/geohot/tinygrad/tree/llama | tiny corp has a website tinygrad.org | Check out and star github.com/geohot/tinygrad | Support George by subscribing twitch.tv/subs/georgehotz | Follow George on instagram.com/georgehotz & twitch.tv/georgehotz & twitter.com/comma_ai to be up to date | Read George's geohot.github.io/blog/ | Programming archive🡿 youtube.com/playlist?list=PLzFUMGbVxlQs5s-LNAyKgcq5SL28ZLLKC

  2. Maybe you should've pre-prompted it more to not give short answers and "I don't know" responses. I think once it says it doesn't know, it will just keep repeating that. I have no idea if it's possible, but if you could do something like negative prompts (like in Stable Diffusion) here, you could give examples of answers you don't want.

  3. llama.cpp is also super easy to run on an M1 processor. I was able to run the 13B and 30B models on a 32GB M1 Max.

  4. I love how George dodges the chatter's question with "I want food".
    This reminds me of what an AI researcher (Stuart Russell) said about asking people in the field:
    it looks like every time the risk was underlined, people automatically dismissed any serious safety question with "yeah, just turn it off" or "nah, I don't know, we'll figure it out".
    Come on George, you're better than that, aren't you?

    Though you're definitely one of the good guys.

  5. Can you program software for Debian that spoofs my device, so if I go to a device-info checker website they will think I use iOS? I am poor, but I can pay you $300 for it sometime this year after one of my paychecks if you are interested. I'd like to make it available to everyone on my website too; I'd make sure to put your name on it and make a whole video glorifying you as a computer god. 😂

  6. Can humanity expand its activities in a limited space? Brilliant brains must find a way to better deter those with destructive brains…

Comments are closed.