This is inaccurate. AES is a block cipher, whereas Salsa/ChaCha are stream ciphers. Block ciphers are easy to accelerate in hardware, as they act on "blocks" of data at a time, whereas stream ciphers go almost byte by byte.
A typical stream cipher produces a stream of bits (not even bytes) and is often described in a manner that can be readily converted into hardware; most such ciphers are also non-trivial to implement efficiently in software.
Stream ciphers implemented in software that are actually somewhat widely used are either based on iterating some block-cipher-like primitive (which may be purpose-designed, as in Salsa/ChaCha) or are related to RC4.
IIRC, the fact that you can derive a stream cipher by iterating essentially any cryptographic primitive (e.g. a hash function) was one of the arguments used by DJB in his court case against the US.
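To make the "iterate any primitive" point concrete, here is a minimal sketch of a keystream generator built by running HMAC-SHA-256 over a nonce and a counter. The names `keystream` and `xor_stream` and the 8-byte counter layout are my own assumptions for illustration; this is not Salsa/ChaCha, not the construction from the court case, and not a vetted design.

```python
import hashlib
import hmac


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Hypothetical helper: concatenate HMAC-SHA-256(key, nonce || counter) blocks."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        # Each counter value yields one independent 32-byte keystream block.
        msg = nonce + counter.to_bytes(8, "big")
        out.extend(hmac.new(key, msg, hashlib.sha256).digest())
        counter += 1
    return bytes(out[:length])


def xor_stream(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt by XORing data with the keystream (same operation both ways)."""
    ks = keystream(key, nonce, len(data))
    return bytes(d ^ k for d, k in zip(data, ks))


if __name__ == "__main__":
    key, nonce = b"k" * 32, b"n" * 8
    ct = xor_stream(key, nonce, b"attack at dawn")
    print(xor_stream(key, nonce, ct))
```

The primitive is only ever run in the forward direction, which is why a hash function works as well as a block cipher here. As with any stream cipher, reusing a (key, nonce) pair would leak the XOR of the two plaintexts.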
What do you mean by typical stream ciphers? AFAIK the most common stream ciphers are A5/1 (and A5/2) used in GSM, Snow3G used in 3G and LTE, E0 used in Bluetooth, and RC4 for WPA in Wi-Fi.
Of these, A5/1 generates bursts of 114 bits, E0 generates two bits at a time, Snow3G generates 32-bit words, and RC4 generates bytes.
Implementing A5/1 in SW is not easy, but Snow3G can be implemented efficiently in SW. For RC4 there are many high-performance SW implementations.
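For reference, RC4 is small enough to show in full, which is part of why software implementations are so common. The sketch below is a plain, unoptimized version (the function names `rc4_keystream` and `rc4` are my own); it only illustrates the byte-at-a-time output and says nothing about how the high-performance implementations are written.

```python
def rc4_keystream(key: bytes):
    """Yield RC4 keystream bytes one at a time."""
    # Key-scheduling algorithm: permute S under control of the key.
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    # Pseudo-random generation algorithm: one output byte per iteration.
    i = j = 0
    while True:
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        yield S[(S[i] + S[j]) % 256]


def rc4(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt data by XORing it with the RC4 keystream."""
    return bytes(d ^ k for d, k in zip(data, rc4_keystream(key)))


if __name__ == "__main__":
    print(rc4(b"Key", b"Plaintext").hex())
```

RC4 has well-known biases and should not be used in new designs; the code is here only to show what "generates bytes" means in practice.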
I do agree, though, that A5/1, E0 and Snow3G are designed to be efficiently implemented in HW.
Besides these algorithms, block ciphers in stream cipher modes (esp. CTR) are used a lot: KASUMI in 3G, LTE and AES in IEEE 802.15.4 (CCM mode) and WPA2, for example.
A5/1 is probably a perfect example of what I had in mind, as it generates output one bit at a time and its output has quite a large period; the fact that in GSM it's used to generate a pair of 114-bit keystreams is somewhat irrelevant to that. All three ciphers in the eSTREAM hardware profile are specified in the same way (although all of them are designed in a way that allows more output bits to be computed in parallel).
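To illustrate what "one bit at a time" looks like, here is a toy Fibonacci LFSR, the basic building block of ciphers like A5/1. It is not A5/1 itself: A5/1 combines three such registers with irregular majority clocking, which this sketch leaves out. The function name `lfsr_bits`, the 4-bit width and the tap positions are my own choices for a register small enough to check by hand.

```python
from itertools import islice


def lfsr_bits(state: int, taps: tuple, width: int):
    """Yield one output bit per clock from a right-shifting Fibonacci LFSR.

    taps are bit positions (0 = least significant bit) XORed to form the feedback bit.
    """
    if state == 0:
        raise ValueError("an all-zero state never changes")
    while True:
        yield state & 1  # output the bit about to be shifted out
        feedback = 0
        for t in taps:
            feedback ^= (state >> t) & 1
        state = (state >> 1) | (feedback << (width - 1))


if __name__ == "__main__":
    # 4-bit register with taps at bits 0 and 1: cycles through all 15 non-zero states.
    print(list(islice(lfsr_bits(0b0001, (0, 1), 4), 15)))
```

In hardware each clock costs a few XOR gates and a shift register, while the software loop above pays several instructions per output bit. That gap is the efficiency argument, and computing several output bits per step in parallel is how it gets narrowed.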