What’s an Activation Function?
Think of a neuron in a neural network like a little decision-maker.
It receives some numbers (inputs), does some math (a weighted sum plus a bias), and then has to decide:
“Should I activate strongly, weakly, or not at all?”
The activation function is the rule it uses to make that decision.
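In code, that whole story fits in a few lines. Here’s a tiny sketch in NumPy (the weights, bias, and input numbers are made up purely for illustration, and ReLU is just one possible “rule”):

```python
import numpy as np

def relu(z):
    # ReLU: keep positive values, zero out everything else
    return np.maximum(0.0, z)

def neuron(inputs, weights, bias):
    # Step 1: the math — a weighted sum of the inputs, plus a bias
    z = np.dot(weights, inputs) + bias
    # Step 2: the decision — the activation function says how strongly to "fire"
    return relu(z)

# Made-up example numbers, just to see it run
x = np.array([0.5, -1.2, 3.0])
w = np.array([0.8, 0.1, -0.4])
print(neuron(x, w, bias=0.2))
```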
⚙️ Why Do We Need It?
If we didn’t use activation functions, every layer would just be doing linear math (straight lines only).
That means:
No matter how many layers we stack, the whole network collapses into one linear function (a composition of linear functions is still linear) → it can’t learn complex patterns.
Activation functions add non-linearity, letting the network learn curves, shapes, patterns, etc.
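You can see the “collapse” for yourself. In this quick sketch the weight matrices are random (chosen only for illustration): two layers with no activation give exactly the same answer as one combined layer.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)

# Two "layers" with no activation function: just back-to-back matrix multiplies
W1 = rng.normal(size=(4, 3))
W2 = rng.normal(size=(2, 4))
two_linear_layers = W2 @ (W1 @ x)

# One combined matrix does the same job — the extra depth bought us nothing
one_layer = (W2 @ W1) @ x

print(np.allclose(two_linear_layers, one_layer))  # True
```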
🚦 Common Activation Functions (and how they behave)
| Function | Idea | Output Range | Example Use |
|---|---|---|---|
| ReLU (Rectified Linear Unit) | If input > 0, keep it; else 0 | 0 → ∞ | Most common in hidden layers |
| Sigmoid | S-shaped curve; squashes values between 0 and 1 | 0 → 1 | Binary classification outputs |
| Tanh | Like sigmoid but between -1 and 1 | -1 → 1 | Hidden layers (older networks) |
| Leaky ReLU | Like ReLU but allows a small slope when input < 0 | -∞ → ∞ | Fixes the “dead neuron” issue |
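If you want to poke at these yourself, here’s a rough NumPy version of each (the 0.01 slope for Leaky ReLU is just a common default, not a fixed rule):

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)             # 0 for negatives, pass-through for positives

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))       # squashes values into (0, 1)

def tanh(z):
    return np.tanh(z)                      # squashes values into (-1, 1)

def leaky_relu(z, alpha=0.01):
    return np.where(z > 0, z, alpha * z)   # small slope below zero keeps neurons from "dying"

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
for name, fn in [("ReLU", relu), ("Sigmoid", sigmoid), ("Tanh", tanh), ("Leaky ReLU", leaky_relu)]:
    print(name, fn(z).round(3))
```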