Voice-over-IP is a generic term; what it describes depends somewhat on the context, but in general it refers to a way of sending voice, digitized and compressed on the fly, over TCP/IP networks (such as the Internet). The TCP/IP family of protocols does not offer travel time as a parameter that can be controlled: different bits of sound get sent in different packets, those packets may arrive with variable delays, or sometimes not arrive at all, and the higher-level software has to do a lot of intelligent packet assembly, sometimes filling the gaps with a 'best guess' of some sort, and almost inevitably at the cost of a bit of delay, or latency. A latency of 30 ms or more is noticeable by a human in a full-duplex (simultaneous speaking and listening) conversation, and somewhere above 200 ms the perceived quality of the connection rapidly deteriorates. Keeping the latency below that level is not easy over the public Internet, all plugged up by data traffic; it requires excellent compression algorithms and fairly robust networks.
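To make the packet assembly described above a little more concrete, here is a minimal sketch of a receive-side buffer that reorders packets and fills a gap by repeating the previous frame. Everything in it is an assumption made for illustration: the class name JitterBuffer, the 20 ms frame length, the two-frame depth and the repeat-last-frame guess are not taken from any particular VoIP product, and real implementations use far more sophisticated concealment.

```python
from typing import Dict, List, Optional


class JitterBuffer:
    """Hypothetical receive buffer: reorders packets, conceals losses."""

    def __init__(self, depth_frames: int = 2, frame_ms: int = 20):
        self.depth_frames = depth_frames  # assumed number of frames held back
        self.frame_ms = frame_ms          # assumed audio duration per packet
        self.pending: Dict[int, List[int]] = {}
        self.next_seq: Optional[int] = None
        self.last_frame: List[int] = []   # nothing played yet

    def added_latency_ms(self) -> int:
        # Holding frames back is what buys tolerance to variable delay.
        return self.depth_frames * self.frame_ms

    def push(self, seq: int, samples: List[int]) -> None:
        # A packet older than the playout point arrived too late to use.
        if self.next_seq is not None and seq < self.next_seq:
            return
        self.pending[seq] = samples

    def pop(self) -> List[int]:
        # Playout starts at the oldest sequence number seen so far.
        if self.next_seq is None:
            self.next_seq = min(self.pending)
        samples = self.pending.pop(self.next_seq, None)
        self.next_seq += 1
        if samples is None:
            # Lost packet: the 'best guess' here is to repeat the last frame.
            return list(self.last_frame)
        self.last_frame = samples
        return list(samples)


def play_all(arrivals, depth_frames: int = 2) -> List[List[int]]:
    """Push every (seq, samples) pair, then drain the buffer in order."""
    buf = JitterBuffer(depth_frames)
    for seq, samples in arrivals:
        buf.push(seq, samples)
    out = []
    while buf.pending:
        out.append(buf.pop())
    return out
```

Feeding it packets 2, 1, 4, 5 (packet 3 never arrives) yields the frames in order 1, 2, 2, 4, 5: the out-of-order pair is straightened out and the hole is papered over with a copy of frame 2. The price is the held-back audio, 40 ms with the assumed settings.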
Unlike on a direct phone connection, the annoying feedback loop (the one in which the mike catches some of the speaker's output and feeds it back in, to yield a loud screech) is almost never encountered in VoIP, because the algorithm must buffer a bit of sound (typically 30-50 ms), and that bit is longer than the period of any frequency that might matter. Instead, the annoying equivalent for VoIP is a chunky "echo": the talker's perception of his/her own voice coming back, delayed. Because the echo usually arises from acoustic (or electrical) pickup at the listener's end, it is usually a much weaker signal, and there is not enough of it to build up to a screech, but still you feel like you are talking into a well, and it gets annoying very quickly. However, echo cancellation algorithms exist, and are quite good at reducing this effect to a bare minimum on full-duplex hardware. There is, unfortunately, another kind of annoying effect associated with the delays and timing uncertainties of IP transmission: loss of interactivity, rearing its ugly head when the signal delay is above 250 ms one-way. To quote a good book on IP telephony:
A 400 ms one-way delay should be exceptional ... and is the limit after which the conversation can be considered half duplex.
When there are large delays on the line, the talker tends to think that the listener has not heard or paid attention. He will repeat what he said and be interrupted by the delayed response of the called party. Both will stop talking -- and restart simultaneously. With some training it is possible to communicate correctly, but the conversation is not natural.
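The delay figures mentioned so far (30 ms, 200 ms, 250 ms and 400 ms) can be collected into a small lookup. This is only a sketch: the function name and the wording of the labels are made up for illustration, and it assumes all four figures can be read as one-way delays, which the text states explicitly only for the last two.

```python
def describe_delay(one_way_ms: float) -> str:
    """Map a one-way delay to the rough experience described in the text.

    Assumption: every threshold is treated as a one-way delay.
    """
    if one_way_ms < 30:
        return "not noticeable"
    if one_way_ms <= 200:
        return "noticeable"
    if one_way_ms <= 250:
        return "quality deteriorating"
    if one_way_ms <= 400:
        return "interactivity lost"
    return "effectively half duplex"


if __name__ == "__main__":
    for ms in (10, 100, 220, 300, 500):
        print(f"{ms:4d} ms one-way: {describe_delay(ms)}")
```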
Just in case you expect miracles: a disclaimer. If you have a 28 kbps modem line, no hardware assist for voice compression, an older and slower CPU, a half-duplex audio card, and a ping round-trip time of 1000 ms or more with a 25% dropped-packet rate, you will not be happy with the results. On the other hand, with hardware-assisted compression and echo cancellation and a 56 kbps or faster connection to the Internet, VoIP works well enough over existing IP networks in the US, in off-peak hours, to call coast-to-coast. It also works extremely well over lightly loaded leased-line WANs or 10/100 Mbps LANs. The great hope is that as the long-distance infrastructure improves, Internet telephony will become a practical alternative for toll-free long-distance communications. I personally doubt it: somebody will find a way to fill whatever free bandwidth there is with ever bigger GIF files. However, I'll be happy to be proven wrong. My point here is that VoIP is a completely valid approach to delivering -- right now! -- voice services over private managed networks (intranets). At the same time, even a slightly choppy long-distance VoIP call can be exciting at zero price, or where a regular voice or cell phone connection is not available.
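A rough back-of-the-envelope calculation shows why the modem speeds in the disclaimer above matter so much. The sketch below assumes the voice travels as RTP over UDP over IPv4 (the text does not say which transport is used), counts 40 bytes of headers per packet, and ignores link-layer framing and header compression. The codec bit rates and the 20 ms frame length are illustrative inputs, not figures from the text.

```python
# Assumed per-packet overhead: IPv4 (20) + UDP (8) + RTP (12) header bytes,
# without options, link-layer framing or header compression.
HEADER_BYTES = 20 + 8 + 12


def ip_bandwidth_bps(codec_bps: float, frame_ms: float) -> float:
    """One-direction IP bandwidth for a voice stream, headers included."""
    packets_per_second = 1000 / frame_ms
    payload_bytes = codec_bps / 8 / packets_per_second
    return (payload_bytes + HEADER_BYTES) * 8 * packets_per_second


def fits(link_bps: float, codec_bps: float, frame_ms: float) -> bool:
    """True if the stream fits on the link with nothing else using it."""
    return ip_bandwidth_bps(codec_bps, frame_ms) <= link_bps


if __name__ == "__main__":
    for codec_bps in (8_000, 64_000):  # hypothetical codec bit rates
        need = ip_bandwidth_bps(codec_bps, frame_ms=20)
        print(f"{codec_bps} bps codec needs about {need:.0f} bps on the wire")
```

Under these assumptions an 8 kbps codec with 20 ms frames already needs about 24 kbps in each direction, which leaves almost no headroom on a 28 kbps line, while an uncompressed 64 kbps stream needs about 80 kbps and does not fit on a 56 kbps connection at all. That is why good compression is a precondition rather than a nicety.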