I was working on an Ubuntu 18.04 instance where the MX record lookup for one specific client domain kept returning SERVFAIL. For example:
dig clientdomain.ca -t mx
This would return a SERVFAIL status.

When trying:
dig clientdomain.ca -t mx @1.1.1.1
The query would return a proper answer. Every other domain I tested resolved fine without specifying the 1.1.1.1 name server.
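To make the comparison between the local resolver and 1.1.1.1 repeatable, here is a minimal sketch. The function names `dig_status` and `compare_resolvers` are my own, not part of dig, and the sketch assumes dig prints its usual header line containing `status: ...,`.

```shell
# Hypothetical helper: print the response status (e.g. NOERROR, SERVFAIL)
# found in the header of dig output read from stdin.
dig_status() {
    sed -n 's/.*status: \([A-Z]*\),.*/\1/p' | head -n 1
}

# Hypothetical helper: query the MX record through the default resolver
# and through an explicit upstream (1.1.1.1 unless given), then print
# both statuses side by side.
compare_resolvers() {
    domain=$1
    upstream=${2:-1.1.1.1}
    local_status=$(dig "$domain" -t mx | dig_status)
    upstream_status=$(dig "$domain" -t mx "@$upstream" | dig_status)
    echo "local: $local_status, $upstream: $upstream_status"
}
```

Running `compare_resolvers clientdomain.ca` prints the two statuses on one line, which makes a mismatch between the local cache and the upstream server easy to spot.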
This led me to believe there was an issue with the systemd-resolved cache. I ran through a series of tests, starting with flushing the cache, all without success.
I was able to find a workaround by bypassing the local 127.0.0.53 stub resolver and its cache, changing the /etc/resolv.conf symlink to point at the non-stub configuration:
~$ ls -al /etc/resolv.conf
lrwxrwxrwx 1 root root 39 Oct 3 16:43 /etc/resolv.conf -> ../run/systemd/resolve/stub-resolv.conf
sudo rm -i /etc/resolv.conf #remove old symlink
sudo ln -s /run/systemd/resolve/resolv.conf /etc/resolv.conf #recreate symlink
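To confirm which of the two files the symlink currently points at, a small check like the following may help. This is only a sketch: `resolv_conf_mode` is a name I made up, and it assumes GNU `readlink -f` is available, as it is on Ubuntu.

```shell
# Hypothetical helper: report whether resolv.conf points at the systemd
# stub file (queries go through 127.0.0.53) or at the file listing the
# upstream name servers directly.
resolv_conf_mode() {
    target=$(readlink -f "${1:-/etc/resolv.conf}") || return 1
    case "$target" in
        */systemd/resolve/stub-resolv.conf) echo "stub" ;;
        */systemd/resolve/resolv.conf)      echo "direct" ;;
        *)                                  echo "other" ;;
    esac
}
```

Calling `resolv_conf_mode` with no argument inspects /etc/resolv.conf. It should print `stub` before the change above and `direct` after it.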
The dig requests now returned answers, but I wasn’t satisfied with the solution. Why was the cache failing? On closer inspection, and I had clearly missed it the first time, the MX record result shows:
;; ANSWER SECTION:
clientdomain.ca. 1800 IN MX 20 mx2-us1.hostedemail.com.
clientdomain.ca. 0 IN MX 10 mx1-us1.hostedemail.com.
See it? The second record has a TTL of 0. systemd-resolved cannot cache a record with a TTL of 0, so it returns SERVFAIL.
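Since a zero TTL is easy to overlook by eye, a quick filter can flag it. The sketch below assumes the standard dig answer layout of name, TTL, class, type, data. `zero_ttl_records` is a hypothetical helper name.

```shell
# Hypothetical helper: read dig output on stdin and print only the
# resource records whose TTL field (second column) is 0.
# Lines starting with ';' are dig comments and are skipped.
zero_ttl_records() {
    awk '!/^;/ && NF >= 5 && $3 == "IN" && $2 ~ /^[0-9]+$/ && $2 + 0 == 0 { print }'
}
```

For example, `dig clientdomain.ca -t mx @1.1.1.1 | zero_ttl_records` would print just the offending MX line from the answer above.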
Solution: the client has changed the TTL on that record.