I have about 30 Reddit feeds on my self-hosted Miniflux instance, and they all started failing with 403 errors. I tried running curl from the same machine that hosts the instance:
> curl https://www.reddit.com/r/gaming/top.rss -vvI -X GET -H "User-Agent: Mozilla/5.0 (compatible; Miniflux/2.0.36; +https://miniflux.app)" -H 'Connection: close' --http1.1 -H 'Accept-Encoding: gzip'
* Trying 151.101.1.140:443...
* TCP_NODELAY set
* Connected to www.reddit.com (151.101.1.140) port 443 (#0)
* ALPN, offering http/1.1
* successfully set certificate verify locations:
* CAfile: /etc/ssl/certs/ca-certificates.crt
CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN, server accepted to use http/1.1
* Server certificate:
* subject: C=US; ST=CALIFORNIA; L=SAN FRANCISCO; O=Reddit Inc.; CN=*.reddit.com
* start date: Feb 17 00:00:00 2022 GMT
* expire date: Aug 16 23:59:59 2022 GMT
* subjectAltName: host "www.reddit.com" matched cert's "*.reddit.com"
* issuer: C=US; O=DigiCert Inc; CN=DigiCert TLS RSA SHA256 2020 CA1
* SSL certificate verify ok.
> GET /r/gaming/top.rss HTTP/1.1
> Host: www.reddit.com
> Accept: */*
> User-Agent: Mozilla/5.0 (compatible; Miniflux/2.0.36; +https://miniflux.app)
> Connection: close
> Accept-Encoding: gzip
>
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
HTTP/1.1 200 OK
I tried changing the Miniflux user agent for this feed to my browser's user agent, but I still get a 403 when I try to update, refresh, or add Reddit feeds. I also ran wget on a feed from inside the container itself (curl isn't available there), and that worked too. Finally, I enabled debug logging and collected the following when trying to add a feed:
[DEBUG] [HttpClient:Before] Method=GET InputURL="https://www.reddit.com/r/gaming/top.rss" ETag=None LastMod=None Auth=false UserAgent="Mozilla/5.0 (compatible; Miniflux/2.0.36; +https://miniflux.app)" Verify=true
[DEBUG] [HttpClient:After] Method=GET InputURL="https://www.reddit.com/r/gaming/top.rss" ETag=None LastMod=None Auth=false UserAgent="Mozilla/5.0 (compatible; Miniflux/2.0.36; +https://miniflux.app)" Verify=true; Response => StatusCode=403 EffectiveURL="https://www.reddit.com/r/gaming/top.rss" LastModified="" ETag= Expires= ContentType="text/plain" ContentLength=7
[DEBUG] [HttpClient] inputURL=https://www.reddit.com/r/gaming/top.rss took 93.221736ms
[ERROR] [UI:SubmitSubscription] Unable to fetch this resource (Status Code = 403)
I also collected this one when trying with my browser's UA:
[DEBUG] [HttpClient:Before] Method=GET InputURL="https://www.reddit.com/r/gaming/top.rss" ETag=None LastMod=None Auth=false UserAgent=" Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.54 Safari/537.36" Verify=true
[DEBUG] [HttpClient:After] Method=GET InputURL="https://www.reddit.com/r/gaming/top.rss" ETag=None LastMod=None Auth=false UserAgent=" Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.54 Safari/537.36" Verify=true; Response => StatusCode=403 EffectiveURL="https://www.reddit.com/r/gaming/top.rss" LastModified="" ETag= Expires= ContentType="text/plain" ContentLength=7
[DEBUG] [HttpClient] inputURL=https://www.reddit.com/r/gaming/top.rss took 80.981398ms
What else is different about how Miniflux calls Reddit? Is there a way for me to read the body of that 403 response? I don't think I'm being rate limited, because when I request the feed with no user agent, I get a 429 with a brief message saying that I should use a user agent.
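
For reference, here is a minimal sketch of how the body of an error response could be read outside of Miniflux. It is only an illustration: the helper name `fetch_status_and_body` is hypothetical, and it goes through Python's standard-library HTTP stack rather than Miniflux's own client, so it may well not reproduce the 403 at all. It just shows that the body of a 4xx response is still readable from the raised error.

```python
import urllib.error
import urllib.request


def fetch_status_and_body(url, user_agent, timeout=10.0):
    """Send a GET request and return (status_code, body_text).

    Hypothetical helper for illustration; error responses (4xx/5xx)
    are returned the same way as successful ones.
    """
    request = urllib.request.Request(
        url,
        headers={
            "User-Agent": user_agent,
            "Connection": "close",
        },
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
            return response.status, body
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx, but the exception still carries
        # the response body, so it can be read like a normal response.
        body = error.read().decode("utf-8", errors="replace")
        return error.code, body
```

Calling `fetch_status_and_body("https://www.reddit.com/r/gaming/top.rss", "Mozilla/5.0 (compatible; Miniflux/2.0.36; +https://miniflux.app)")` and printing the result would show the status code and whatever short text comes back with it.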