AI for Non-Techies
Raphaël: Hey folks, and welcome to this latest episode of The Small Tech Podcast by EC. I'm your host, Raph. And today we're going to be talking about AI. Before we get going, though, remember to like and subscribe if you like the work that we do. We're a small team, and we'd really appreciate the support; it helps a lot. Leave us a review, too. And if you want to join us on an episode, we just released our first interview, and we'd love to have you on if you want to talk about building small tech products.
So let's talk about AI.
AI has been in the news a lot over the past year or two. But something I've started to recognize recently is that as the hype grows around it, so do the misconceptions. And while I'm no expert in the mechanics of what goes on deep under the hood, I do recognize some things that people don't understand about what this technology is, and specifically the stuff we're talking about these days.
So, a little bit of history. We've been calling things that really are just different forms of statistical analysis "AI" or "machine learning" or whatever for a while now. What we're calling AI at the moment is an extension of what we were referring to as machine learning more recently, and big data prior to that. And it's mixed in with other concepts like neural networks and GANs and all kinds of other stuff.
I remember playing with tools based on generative adversarial networks many years ago at this point. But they didn't enter the sort of hype cycle we've gotten into with AI lately. And I think the reason for that is the results, the outputs, the things you got out of those tools: from a consumer perspective they weren't as fun and exciting, and from a business perspective they were great, but they didn't tap into people's imagination in the same way that the tools we have now do.
So we've had AI being used for all kinds of things, from predicting trends in large datasets to parsing audio and generating text in different ways, for a while now. The thing that's changed in the past couple of years specifically is what we're now referring to as generative AI. And I think there are two things in there that people are particularly enamored with and find fascinating to play with.
One is the image generators: being able to just describe something and have an image come out of that description. Until recently there wasn't a great way to generate decent outputs for something like that, but in the past few years the results have gotten way better than they were before. And of course with that comes a whole slew of controversies around intellectual property and more. We're not going to talk so much about that in this episode, but we will talk a little bit about the mechanics of what's going on under the hood.
The other thing is large language models, which kind of do the same thing, but with text. You provide some sort of input; that might be a short prompt, it might be a long prompt. And basically you get something back that either continues a conversation, if you're using something with a chat-style interface to an LLM (think of ChatGPT), or it might just complete something that you started, though we're seeing less of that at the moment. Functionally, though, under the hood that's what even the chat interfaces do.
And here's where we'll start to dive into the mechanics, because I want to talk specifically to people who are not deeply technical. So if you're an AI expert, or you're a developer who's really deep into this stuff, this probably isn't for you. And if you do keep watching, you might find yourself shaking your fist at the screen (if you're watching) or in the air (if you're listening), saying "that's not quite right." And I think that's okay. I'm going to try and cover things in a way that makes sense for someone who is non-technical but is interested in what can be done with these technologies, wants to better understand them, wants to understand how they might fit into a product roadmap, or wants to see what they can build with this technology. Or even somewhat technical folks who just haven't played with it yet.
I don't actually understand very well what's going on with the image generators. But on some level, what I understand is essentially this: images are fed into the training systems, but don't think of them as being stored in there. It's more like information about the images is being stored, and correlated, and mashed together into this system that can then correlate descriptions with that metadata about the images and spit out something that fits those correlations.
So say you fed a lot of images by van Gogh into one of these systems, and they're all described as van Gogh, and they might say "oil painting" as well somewhere in the description. There might be descriptions of the types of colors and the brushstrokes and all of those things that really define a van Gogh painting. During training, basically what happens is the system learns to recognize those similarities: oh, we see things about the quality of the brushstrokes, and oil paint, and van Gogh. And it correlates that with that style of painting, with the way the pixels are grouped next to each other, the contrasts, the colors, the patterns, those types of things. And then it can apply that to something else.
Likewise, if you have many images of dogs being fed in, described as dachshund or husky or golden retriever, the shapes of the dogs are parsed and stored as metadata in the system. It's not storing the images of the dogs; it's storing the rough shapes and ideas of what a dog is, or what those dog breeds are. And then you can match those together. So if you take the shape of a dog, but the texture of a van Gogh, you get a painting of a golden retriever in the style of van Gogh.
But of course there are limits to how we describe images and paintings and all of these things. And so you'll find when you play with these tools that they come up with really weird stuff, and you can see that they don't really understand, and you can also see that they're clearly not copies, because things come out weirdly; they don't make sense. The tools are getting a lot better at that, but think of that as the way these systems work: they understand patterns, they understand metadata about these images, they bring in descriptions and keywords, and they mash those things together to deconstruct all of the training data into a correlational machine.
And then when you provide a prompt, it goes from that sort of loosey-goosey something, building back up on those correlations, and spits out something that may or may not make sense. Of course, more and more they are making sense, and they're generating outputs that look really good. So do with that what you will. Maybe it helps you understand a bit more of the context for the controversies, or it may give you some context for how to work with these things. But hopefully it helps.
The other thing is large language models. You can think of them very similarly: basically, during training, these systems bring in tons and tons of data, all kinds of written text from throughout history. Again, there are controversies around how those data are sourced; we're not going to dive into that. But basically you have massive datasets of written language in various languages. And under the hood, it is really just fancy auto-complete. That is basically what these systems do, but on a really large scale. When you're writing an essay, a paragraph of your writing is based on the previous paragraph you wrote, and the previous chapter that you wrote, or the research that you put into that paper.
Maybe you could imagine that research as coming before, further up the chain, if you had all of your research and your writing in one document. You could think of it that way: you have all of these inputs, and then your next paragraph is predictable, in a sense, based on the inputs and the previous writing that you've done. What comes next? Essentially, that's what these systems are trained to do.
They recognize patterns in language. And so they know that if you have a paragraph about how playful dogs are, then maybe your next paragraph is going to be about the joy they bring to their humans, something along those lines. And that would depend on the rest of the context of what you're writing. But they're not coming up with anything novel. There is some randomness in the system that is intentionally placed there, but it is really just auto-complete.
And even with the chat interfaces, functionally what they're doing is still auto-complete. If I have a conversation with someone and I say, "Hi, how are you?", predictably the response will be something about how the person I'm talking to is doing, and probably asking me back, "Hey, how are you doing?" I'll respond, and we carry on with the conversation. Those patterns are encoded in the language that we've left strewn around the internet and through books and other media. So these systems find those correlations; they understand the patterns; they're trained on those patterns. If there's one word in front, what's the next word that comes after? And fundamentally, they just do long chains of that, based on the training data.
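To make "fancy auto-complete" concrete, here's a toy sketch. This is nothing like a real LLM under the hood; it's just the simplest possible "predict the next word from patterns in the training text" model, with made-up training text:

```python
# A toy "fancy auto-complete": predict the next word purely from which
# words followed which in the training text. Real LLMs are vastly more
# sophisticated, but the core loop is the same idea: given the words so
# far, pick a likely next word, append it, repeat.
import random
from collections import defaultdict

training_text = "dogs are playful and dogs bring joy to their humans"
words = training_text.split()

# Count which words tend to follow each word.
next_words = defaultdict(list)
for current, following in zip(words, words[1:]):
    next_words[current].append(following)

def generate(start: str, length: int = 8) -> str:
    out = [start]
    for _ in range(length):
        candidates = next_words.get(out[-1])
        if not candidates:
            break
        # A bit of intentional randomness, just like the real systems.
        out.append(random.choice(candidates))
    return " ".join(out)

print(generate("dogs"))  # e.g. "dogs bring joy to their humans"
```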
So when people talk about hallucinations, for example, I think it's really important to understand that the outputs of these systems are just word predictions. They're not knowledge predictions. They're not logical predictions. They don't really understand what's going into them. You could have philosophical debates about whether we understand what's going into our brains as we hear words and parse them; I'm not going to go into that. But the mechanics of it are: here are some words, what are the next logical words that should follow?
And that also means that these systems don't have real-time access to data. Now, there's a caveat there, and we're going to talk about that in a second. But if you're just chatting with something like ChatGPT or Claude or any of the other ones, and you're not using a version of them that is connected to the internet, the data they're trained on is potentially old, and it doesn't have context.
Now, there are techniques that you can use, and that are being used by all kinds of systems, including search systems that use AIs. So maybe you're using Bing Chat, or you're on a paid tier of ChatGPT and you ask it to make sure that when it responds to you, it sources from the internet. Or maybe you're using a search system like Perplexity. All of those systems use a technique called RAG: retrieval-augmented generation.
What that does is use, I'm going to say algorithms, basically methods, to figure out what is most relevant based on your query in what is fundamentally a traditional search index. If you just type "recipes for apple pie" into Google, it will go find things and give you a list of results. What Bing Chat, or ChatGPT connected to the web on the paid tier, or Perplexity do is kind of just a normal web search. And then they add that text into the frame, the context, of your question.
It might not be there literally.
But under the hood what's
happening is they're pulling
paragraphs from those sources.
And saying, okay.
If you asked.
How do I make apple pie?
Let's go fetch these recipes.
Add them into the context.
In a way that's hidden and
then respond with that context.
So now when the fancy auto-complete
keeps going, it has both the context of
the question, but also those paragraphs
about how to bake an apple by and so
when it responds, it will give you stuff.
That's a lot more accurate.
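Here's a minimal sketch of that RAG flow. The `web_search` and `ask_llm` helpers are hypothetical stand-ins for whatever search API and LLM client you'd actually use; the point is the shape of the pipeline, not any particular library.

```python
# A minimal retrieval-augmented generation (RAG) sketch.
# `web_search` and `ask_llm` are hypothetical stand-ins for a real
# search API and a real LLM client.

def web_search(query: str, max_results: int = 3) -> list[str]:
    # Stand-in: call your real search API and return text snippets.
    return [f"(snippet {i + 1} about: {query})" for i in range(max_results)]

def ask_llm(prompt: str) -> str:
    # Stand-in: call your real LLM API here.
    return "(model response)"

def answer_with_rag(question: str) -> str:
    # 1. Do a normal web search for the user's question.
    snippets = web_search(question)

    # 2. Tuck the retrieved paragraphs into the prompt, hidden from the user.
    context = "\n\n".join(snippets)
    prompt = (
        "Answer the question using ONLY the sources below.\n\n"
        f"Sources:\n{context}\n\n"
        f"Question: {question}"
    )

    # 3. Let the fancy auto-complete continue from that combined context.
    return ask_llm(prompt)

print(answer_with_rag("How do I make apple pie?"))
```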
Now, you can also get fancy with your prompting and basically tell it, "Do not use anything other than valid sources," and then it should understand from the context that whatever comes back from a search is the valid source you're looking for. You can also tell it to only return results with URLs that you can visit, so you can then go validate those sources yourself. And you can imagine how you might integrate those types of techniques into your own products.
Let's say you've got a note-taking app with a database of notes, and you need a system to parse a user's input, someone asking, "What am I doing on Thursday?" You might take that input and convert it into a more standard search: figure out what date Thursday is, filter the notes down, grab the text from those notes, and then tell the large language model you're using: "Hey, this person wants to know what they're doing. Here are some notes that are relevant to this coming Thursday. Respond to them only with that context; do not include anything else." And now you've returned that response to the user.
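As a sketch, that note-taking flow might look something like this. The `Note` shape, the `ask_llm` helper, and the prompt wording are all made up for illustration; it's the same RAG pattern pointed at your own database instead of the web.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class Note:
    day: date
    text: str

def ask_llm(prompt: str) -> str:
    # Stand-in: call your real LLM API here.
    return "(model response)"

def whats_on_thursday(notes: list[Note], today: date) -> str:
    # 1. Convert the user's question into a standard query: which date
    #    is this coming Thursday? (Thursday is weekday 3; Monday is 0.)
    thursday = today + timedelta(days=(3 - today.weekday()) % 7)

    # 2. Filter the notes down to that date and grab their text.
    relevant = [n.text for n in notes if n.day == thursday]

    # 3. Hand the LLM the question plus only that context.
    prompt = (
        "This person wants to know what they're doing on Thursday. "
        "Here are the notes relevant to this coming Thursday. "
        "Respond using only this context; do not include anything else.\n\n"
        + "\n".join(relevant)
    )
    return ask_llm(prompt)
```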
So those are some ways to think about what's happening in large language models: how they can be used appropriately to reduce hallucinations, what is really happening when you interact with one of these models, and how they may or may not make sense in a product.
Now, there's so much that we're going to figure out as a society over the coming years: how training data should be sourced, how these systems should be used, how they should be regulated. But personally, I think there's something really interesting and exciting about the way these systems work, particularly large language models, because they fundamentally become a different way of interacting with information and knowledge.
The way I see it, as long as they're paired with search or search-like systems, they become a new interface for the data that we store, the stories that we have, the knowledge that we share. There are valid debates to be had about how the training data should be sourced and how we should oversee the training of these systems. But I think it's undeniable that there's value in being able to parse language better than we have been able to using computers, which fundamentally have not been able to parse language well so far. And I think it just opens up so many possibilities for things we can create, because so much of what we are as a modern society is encoded in language, and language that is stored in ways that can be used to build these systems.
Yeah, I think we need to be careful. I think there does need to be regulation, primarily about the outputs; I'm less worried about the inputs. And I think there need to be best practices in place for practitioners who want to use these systems within their own products.
But hopefully this episode helped you understand what's going on under the hood, what the possibilities are, and how you might want to engage with these tools. And yeah, I hope it was informative.
Alrighty. Thanks for listening, folks. If you enjoy this stuff, please subscribe on YouTube, and subscribe to the podcast in your podcast app of choice. Also leave us a rating and review; it really helps us out. We'd love to hear what you think. And I'd personally really love to have you on the podcast; I'd love to talk to other people about this stuff.
Also make sure to sign up for our newsletter, where we'll be sending you all kinds of great info about how to build a small tech product. There are going to be videos, blog posts, episodes of the podcast you may have missed, and plenty of other stuff. So head to smalltechpodcast.com and subscribe there. That's it for this week's episode. We all want to do good in the world, folks, so go out there and build something good. See ya.