dayDreams ++

Getting new Lobste.rs feed with Python and Telegram

Posted on Mar 28, 2021 | 3 minutes

It’s been some time since I’ve written a new technical post, because I took a small break from writing code. I have no idea why, but it wasn’t worth it anyway.

Back to the point: I recently discovered https://lobste.rs and found the discussions quite good. But like anyone who writes code, I was too lazy to open my browser and navigate to the website. Telegram is one of the main apps I use to get my feeds, so I thought, why not write a script that fetches the newest posts from lobste.rs so I can read them and join the discussion from there? It’s easier for me.

For a problem like this, my mind went straight to Python. I set up an environment and installed the main packages, such as feedparser and the deta library. I’ll explain why the deta library is needed later.

I chose to parse the RSS feed because lobsters doesn’t have an official API. There are some endpoints like /hottest, but they return a lot of data that I have no need for. Also, I hadn’t used feedparser before, so this was a good opportunity to get comfortable with it.

Parsing the Feed

This is quite straightforward; feedparser was built for it.

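import feedparser

LOB_URL:str = "https://lobste.rs/rss"

def getfeedData():
    feeds = feedparser.parse(LOB_URL)
    feedList = []
    for feed in feeds['entries']:
        title = feed.title
        uri = feed.link
        _id = feed.id  # the <guid>, e.g. https://lobste.rs/s/v5y4jb
        key = _id.split("/")[-1]  # short story ID, e.g. v5y4jb
        comments = feed.comments
        feedDict = {"title":title,"url":uri,"post":_id,"comments":comments,"key":key}
        # Collect every entry instead of keeping only the last one
        feedList.append(feedDict)
    return feedList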

This function is more than enough to get the required data; we’ll update it later to add further features. For reference, this is the XML for one item in the feed:

      <item>
        <title>How safe is zig?</title>
        <link>https://scattered-thoughts.net/writing/how-safe-is-zig/</link>
        <guid isPermaLink="false">https://lobste.rs/s/v5y4jb</guid>
        <author>[email protected] (adaszko)</author>
        <pubDate>Sat, 20 Mar 2021 02:52:03 -0500</pubDate>
        <comments>https://lobste.rs/s/v5y4jb/how_safe_is_zig</comments>
        <description>

            &#60;p&#62;&#60;a href=&#34;https://lobste.rs/s/v5y4jb/how_safe_is_zig&#34;&#62;Comments&#60;/a&#62;&#60;/p&#62;
        </description>
          <category>c</category>
          <category>rust</category>
          <category>zig</category>
      </item>
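
To make the mapping between that XML and the dict concrete, here is a minimal sketch that pulls the same fields out of the sample item using only the standard library. It is an illustration rather than what the script does: feedparser handles this for us and exposes the guid as `feed.id`. The helper name `item_to_dict` is made up for this example.

```python
import xml.etree.ElementTree as ET

# One <item> copied from the feed excerpt, trimmed to the fields the script reads.
SAMPLE_ITEM = """
<item>
  <title>How safe is zig?</title>
  <link>https://scattered-thoughts.net/writing/how-safe-is-zig/</link>
  <guid isPermaLink="false">https://lobste.rs/s/v5y4jb</guid>
  <comments>https://lobste.rs/s/v5y4jb/how_safe_is_zig</comments>
</item>
"""

def item_to_dict(item_xml: str) -> dict:
    """Build the same dict shape as getfeedData() (hypothetical helper)."""
    item = ET.fromstring(item_xml.strip())
    guid = item.findtext("guid")
    return {
        "title": item.findtext("title"),
        "url": item.findtext("link"),
        "post": guid,
        "comments": item.findtext("comments"),
        # The short story ID is the last path segment of the guid.
        "key": guid.split("/")[-1],
    }

post = item_to_dict(SAMPLE_ITEM)
print(post["key"])  # v5y4jb
```

The `key` value is what later becomes the row key in the database, so two fetches of the same story always produce the same key.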

Connecting it to Telegram

Connecting this data to Telegram is also quite straightforward. The only requirements are a Telegram Bot API token and a chat ID. You can refer here for getting started. We use Telegram’s Bot HTTP API to send messages to the specified chat ID via a bot. Here is the code for sending a message. I have made it into a function so it can be reused.

import requests
from typing import List

bot_token:str = "Telegram bot token"
chat_id:str = "telegram chat id"

def sendTGMessage(message:str)->None:
    url = f'https://api.telegram.org/bot{bot_token}/sendMessage'
    msg_data = {'chat_id':chat_id,'text':message,"parse_mode":"Markdown"}
    resp = requests.post(url, msg_data).json()
    # Telegram replies with {"ok": false, ...} when the request is rejected
    if not resp['ok']:
        print("Message Not Sent")
    else:
        print("👉    Message Sent")

def createMessages(feed:List)->None:
    for post in feed:
        # Legacy Markdown mode marks bold with single asterisks
        message = f"""
*{post["title"]}*

[Article]({post["url"]}) | [Comments]({post["post"]}) | [Lobster Post]({post['comments']})

            """
        sendTGMessage(message)
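
One thing to watch for: story titles can contain characters such as `_`, `*` or `[`, which Telegram may treat as Markdown markup. A possible alternative, sketched here as an assumption and not part of the original script, is to send the message with "parse_mode":"HTML" and escape the values with the standard library. The helper name `build_html_message` and the sample post are made up for illustration.

```python
from html import escape

def build_html_message(post: dict) -> str:
    """Format one post as Telegram HTML (hypothetical helper)."""
    title = escape(post["title"])
    # escape() also converts quotes by default, which keeps href values intact.
    article = escape(post["url"])
    comments = escape(post["comments"])
    return (
        f"<b>{title}</b>\n\n"
        f'<a href="{article}">Article</a> | '
        f'<a href="{comments}">Comments</a>'
    )

# Made-up post, chosen only because it contains characters that need escaping.
sample = {
    "title": "Parsing <item> tags & more",
    "url": "https://example.com/a?x=1&y=2",
    "comments": "https://lobste.rs/s/abc123/example",
}
print(build_html_message(sample))
```

To use something like this, the string it returns would be passed to sendTGMessage() with the parse mode in msg_data switched to HTML.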

This does the job quite well, but it can’t tell which posts are new. The workaround I found when I wrote this script was to save each post’s ID to a small DB; if an ID doesn’t exist there yet, it counts as a new story/post. Thinking about it later, I could also have implemented a time-based comparison system. For now, I’ll share the code for the DB system rather than the time-based one.
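
For the curious, the time-based idea could look roughly like the sketch below. It is not what the script does, and it assumes the time of the previous run is stored somewhere as a timezone-aware datetime. The helper name `is_new` is made up; the pubDate string is the one from the feed item shown earlier.

```python
from datetime import datetime, timedelta
from email.utils import parsedate_to_datetime

def is_new(pub_date: str, last_run: datetime) -> bool:
    """Return True if an RSS pubDate string is later than the previous run."""
    return parsedate_to_datetime(pub_date) > last_run

# pubDate from the feed item shown earlier; the last-run times are made up.
pub_date = "Sat, 20 Mar 2021 02:52:03 -0500"
published = parsedate_to_datetime(pub_date)

print(is_new(pub_date, published - timedelta(minutes=30)))  # True
print(is_new(pub_date, published + timedelta(minutes=30)))  # False
```

The trade-off is that this needs only one stored timestamp instead of one row per story, but a story could be missed or repeated if the stored time and the actual fetch time drift apart.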

New Posts

As I mentioned, the small DB I could think of was Deta Base. It has a small API and is really easy to use. You can read more on setting up a Deta Base here.

Deta Bases have a key for each row/entry added. It can be set manually or autogenerated. If we set it manually, we can use a try-except statement to check whether that key already exists: inserting throws an HTTP error if the key already exists.

from urllib.error import HTTPError
from deta import Deta,app
from typing import List

db = Deta("deta-key").Base("db-name")

def getfeedData()->List:
    feeds = feedparser.parse(LOB_URL)
    feedList = []
    for feed in feeds['entries']:
        title = feed.title
        uri = feed.link
        _id = feed.id
        key = _id.split("/")[-1]
        comments = feed.comments
        feedDict = {"title":title,"url":uri,"post":_id,"comments":comments,"key":key}
        try:
            # insert() fails if the key is already stored, so only unseen posts get through
            ls = db.insert(feedDict)
            feedList.append(ls)
        except HTTPError:
            print("Key already exists")
    return feedList

This is the updated code for the getfeedData() function.

Running it on Time

We installed the Deta library for this purpose alone; the DB part came to mind later. Deta has a wonderful feature for setting up cron tasks: we can specify a function in our code to run at a specified time or at regular intervals.

@app.lib.cron()
def MashUppppp(event):
    feed = getfeedData()
    createMessages(feed)
    return "Cron Run Successful"

I deployed this on Deta too. Refer here for more info on the deployment part.

Here is the full code for running it:

import feedparser
import requests
from urllib.error import HTTPError
from deta import Deta,app
from typing import List

db = Deta("deta key").Base("db name")

LOB_URL:str = "https://lobste.rs/rss"
bot_token:str = "Telegram bot token"
chat_id:str = "telegram chat id"

def getfeedData()->List:
    feeds = feedparser.parse(LOB_URL)
    feedList = []
    for feed in feeds['entries']:
        title = feed.title
        uri = feed.link
        _id = feed.id
        key = _id.split("/")[-1]
        comments = feed.comments
        feedDict = {"title":title,"url":uri,"post":_id,"comments":comments,"key":key}
        try:
            ls = db.insert(feedDict)
            feedList.append(ls)
        except HTTPError:
            print("Key already exists")
    return feedList

def sendTGMessage(message:str)->None:
    url = f'https://api.telegram.org/bot{bot_token}/sendMessage'
    msg_data = {'chat_id':chat_id,'text':message,"parse_mode":"Markdown"}
    resp = requests.post(url, msg_data).json()
    # Telegram replies with {"ok": false, ...} when the request is rejected
    if not resp['ok']:
        print("Message Not Sent")
    else:
        print("👉    Message Sent")

def createMessages(feed:List)->None:
    for post in feed:
        # Legacy Markdown mode marks bold with single asterisks
        message = f"""
*{post["title"]}*

[Article]({post["url"]}) | [Comments]({post["post"]}) | [Lobster Post]({post['comments']})

            """
        sendTGMessage(message)

@app.lib.cron()
def MashUppppp(event):
    feed = getfeedData()
    createMessages(feed)