I am using the function below to match URLs inside a given text and replace them with HTML links. The regular expression is working great, but currently only the first match is replaced.
How can I replace all the URLs? I guess I should be using the exec command, but I could not figure out how to do it.
function replaceURLWithHTMLLinks(text) {
    var exp = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/i;
    return text.replace(exp, "<a href='$1'>$1</a>");
}
Best Answer
First off, rolling your own regexp to parse URLs is a terrible idea. You must imagine this is a common enough problem that someone has written, debugged and tested a library for it, according to the RFCs. URIs are complex - check out the code for URL parsing in Node.js and the Wikipedia page on URI schemes.
There are a ton of edge cases when it comes to parsing URLs: international domain names, actual (.museum) vs. nonexistent (.etc) TLDs, weird punctuation including parentheses, punctuation at the end of the URL, IPv6 hostnames, etc.
I've looked at a ton of libraries, and there are a few worth using despite some downsides:
href attribute inside anchor (<a>) tags"). I'll throw some tests at it when a demo becomes available.
Libraries that I've disqualified quickly for this task:
If you insist on a regular expression, the most comprehensive is the URL regexp from Component, though from looking at it, it will falsely detect some non-existent two-letter TLDs.
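As for the literal question: if you do stay with your own pattern, the reason only the first URL gets replaced is that the regex lacks the global flag, so exec is not needed. Below is a minimal sketch that reuses the pattern from the question unchanged apart from adding "g". The sample string is made up for illustration, and the edge cases listed above (and HTML-escaping of the matched text) are not handled.

```javascript
// Sketch only: the question's function with the global flag added.
function replaceURLWithHTMLLinks(text) {
    // Same pattern as in the question; "g" makes replace() act on every match,
    // "i" keeps the match case-insensitive as before.
    var exp = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/gi;
    return text.replace(exp, "<a href='$1'>$1</a>");
}

// Hypothetical input with two URLs, the second followed by a sentence-ending period.
var sample = "See http://example.com/a and https://example.org/b?x=1.";
console.log(replaceURLWithHTMLLinks(sample));
// See <a href='http://example.com/a'>http://example.com/a</a> and <a href='https://example.org/b?x=1'>https://example.org/b?x=1</a>.
```

The trailing period stays outside the link because the last character class in the pattern does not include ".", but other cases, such as URLs wrapped in parentheses or internationalized domain names, are exactly where a tested library earns its keep.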