Update readme and installation steps
2
Makefile
@@ -17,7 +17,7 @@ feed_cleanup: ## Cleanup RSS feeds
 	@python3 ./scripts/cleanup.py

 feed_init: ## Initialize feeds from boards.yml
-	@python3 ./scripts/initialize.py
+	@python3 ./scripts/initialize.py --config boards.yml --no-upload-favicons -y

 feed_refresh: ## Refresh RSS feeds
 	@python3 ./scripts/update.py
120
README.md
@@ -1,21 +1,121 @@
|
||||
# infomate.club
|
||||
# 😋 [infomate.club](https://infomate.club) [](https://travis-ci.org/vas3k/infomate.club)
|
||||
|
||||
Experimental project
|
||||
Infomate is a small web service that shows multiple RSS sources on one page, and parses and summarizes the articles using the TextRank algorithm.
|
||||
|
||||
### Build and run
|
||||
It helps to keep track of news from different areas without subscribing to hundreds of media accounts and getting annoying notifications.
|
||||
|
||||
```shell script
|
||||
Thematic and people-based collections do a really good job of surfacing new sources of information.
Since we are all biased, such compilations can really help us get out of our information bubbles.
|
||||
|
||||
Live URL: [infomate.club](https://infomate.club)
|
||||
|
||||

|
||||
|
||||
## This is a pet-project 🐶

Which means you really shouldn't expect much from it. I wrote it over the weekend to solve my own pain.
No state-of-the-art kubernetes bullshit, no architecture patterns, not even any tests.
It's here just to show people what a pet-project might look like.

I wrote this code for fun, not for work. That's usually a huge difference.

Like between riding a bike on the streets and cycling in the wild for fun :)

## How it works

It's basically a Django web app with a bunch of [scripts](scripts) for RSS parsing.
It stores the parsed data in a PostgreSQL database.

The web app is only used to show the data (with heavy caching).
Parsing and feed updates are performed by the three scripts running in cron. Like poor people do.

[Feedparser](https://pythonhosted.org/feedparser/) and [BeautifulSoup](https://www.crummy.com/software/BeautifulSoup/bs4/doc/) are used to find, download and parse RSS.

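
As a rough illustration of what parsing an RSS feed involves, here is a minimal sketch that uses only Python's standard library (`xml.etree.ElementTree`) rather than Feedparser, so it runs without extra dependencies. The sample feed, the `parse_rss` helper and the returned dict keys are assumptions made for this example, not code from the project's scripts.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written RSS 2.0 document used only for this illustration.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>First post</title>
      <link>https://example.com/first</link>
    </item>
    <item>
      <title>Second post</title>
      <link>https://example.com/second</link>
    </item>
  </channel>
</rss>"""


def parse_rss(xml_text):
    """Return a list of {"title", "link"} dicts, one per <item> in the feed."""
    root = ET.fromstring(xml_text)
    articles = []
    for item in root.iter("item"):
        articles.append({
            "title": item.findtext("title", default="").strip(),
            "link": item.findtext("link", default="").strip(),
        })
    return articles


if __name__ == "__main__":
    for article in parse_rss(SAMPLE_RSS):
        print(article["title"], "->", article["link"])
```

Real-world feeds are far messier than this sample (Atom, namespaces, broken XML), which is a good reason to rely on a dedicated library for them.
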
Text summarization is done via [newspaper3k](https://newspaper.readthedocs.io/en/latest/) with some additional
protection against bad types of content, like podcasts and overly large pages in general, which can eat all your memory. Anything can happen in the RSS world :)

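
The kind of protection described above could look roughly like the sketch below. It is not the project's actual code: the `is_safe_to_parse` helper, the 5 MB limit and the list of skipped content types are all assumptions chosen for illustration.

```python
# Hypothetical limits; the real project may use different values or checks.
MAX_PAGE_BYTES = 5 * 1024 * 1024  # assumed 5 MB cap
SKIPPED_TYPE_PREFIXES = ("audio/", "video/", "application/octet-stream")


def is_safe_to_parse(content_type, content_length):
    """Decide from response headers whether a page is worth downloading.

    content_type: value of the Content-Type header, or None if absent.
    content_length: Content-Length as an int, or None if absent.
    """
    if content_type:
        # Drop parameters such as "; charset=utf-8" before comparing.
        media_type = content_type.split(";")[0].strip().lower()
        if media_type.startswith(SKIPPED_TYPE_PREFIXES):
            return False  # e.g. podcast audio files
    if content_length is not None and content_length > MAX_PAGE_BYTES:
        return False  # too big, could exhaust memory
    return True


if __name__ == "__main__":
    print(is_safe_to_parse("text/html; charset=utf-8", 120_000))
    print(is_safe_to_parse("audio/mpeg", 48_000_000))
```

Checking headers before downloading the body is only a first line of defence, since servers can omit or misreport both values.
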
## Running it locally
|
||||
|
||||
The easy way. Install [docker](https://docs.docker.com/install/) on your machine. Then:
|
||||
|
||||
```
|
||||
git clone git@github.com:vas3k/infomate.club.git
|
||||
cd infomate.club
|
||||
docker-compose up --build
|
||||
```
|
||||
|
||||
### Run
|
||||
After that navigate to [localhost:8000](http://localhost:8000)
|
||||
|
||||
```shell script
|
||||
docker-compose up
|
||||
```
|
||||
|
||||
### Terminate
|
||||
To terminate:
|
||||
|
||||
```shell script
|
||||
docker-compose down --remove-orphans
|
||||
```
|
||||
|
||||
|
||||
## Running for development

Make sure you have Python 3 and PostgreSQL installed locally.

#### Step 1: Install requirements

```
pip3 install -r requirements.txt --user
```

#### Step 2: Create a database structure

```
python3 manage.py migrate
```

#### Step 3: Take a look at [boards.yml](boards.yml)

This is the main source of truth for all RSS streams and collections in the service.
All updates to the database are made through it. The first time around, you can just use the existing one.

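
To make the idea concrete, the sketch below mirrors one feed entry from boards.yml as a plain Python dict and applies its `conditions` to a couple of entries. How the project actually evaluates conditions is not described in this README, so the substring interpretation of `type: in` and the `matches_conditions` helper are assumptions.

```python
# One feed entry, shaped like the "vc.ru" example in boards.yml.
FEED = {
    "name": "vc.ru",
    "url": "https://vc.ru",
    "rss": "https://vc.ru/rss/all",
    "conditions": [
        {"type": "in", "field": "link", "in": "https://vc.ru/tech/"},
    ],
}


def matches_conditions(entry, conditions):
    """Return True if the entry satisfies every condition (assumed semantics)."""
    for condition in conditions:
        if condition["type"] == "in":
            # Assumption: "in" means the given string must occur in the field.
            if condition["in"] not in entry.get(condition["field"], ""):
                return False
        else:
            raise ValueError(f"unknown condition type: {condition['type']}")
    return True


if __name__ == "__main__":
    entries = [
        {"title": "A tech story", "link": "https://vc.ru/tech/12345"},
        {"title": "A finance story", "link": "https://vc.ru/finance/67890"},
    ]
    kept = [e for e in entries if matches_conditions(e, FEED["conditions"])]
    print([e["title"] for e in kept])
```
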
#### Step 4: Initialize your feeds

```
python3 scripts/initialize.py --config boards.yml
```

> Every time you make a change to boards.yml, just run this script again.
> It is smart enough to create the missing ones and remove the old ones.

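
Creating what is missing and removing what is outdated amounts to a set difference between what the config declares and what the database already holds. Here is a minimal sketch, assuming feeds are identified by their RSS URL; the `plan_sync` helper and that choice of key are hypothetical and not taken from `scripts/initialize.py`.

```python
def plan_sync(configured_urls, stored_urls):
    """Compare feeds declared in the config with feeds already stored.

    Returns (to_create, to_delete) as sorted lists of RSS URLs.
    """
    configured = set(configured_urls)
    stored = set(stored_urls)
    to_create = sorted(configured - stored)  # in the config, not yet stored
    to_delete = sorted(stored - configured)  # stored, but gone from the config
    return to_create, to_delete


if __name__ == "__main__":
    in_config = ["https://tproger.ru/feed/", "https://readwrite.com/feed/"]
    in_database = ["https://readwrite.com/feed/", "https://tjournal.ru/rss/all"]
    to_create, to_delete = plan_sync(in_config, in_database)
    print("create:", to_create)
    print("delete:", to_delete)
```

Because the plan depends only on the two sets, running it twice in a row with the same config yields nothing to create or delete the second time.
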
#### Step 5: Fetch some articles

```
python3 scripts/update.py
```

> Don't run it too often, otherwise sites may ban your IP.
> There is a hardcoded cooldown interval for each feed, but you can use the `--force` flag to ignore it.

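
A per-feed cooldown check can be sketched in a few lines. The 30-minute interval and the `should_refresh` helper below are assumptions for illustration; the README only says that the real interval is hardcoded and that `--force` bypasses it.

```python
from datetime import datetime, timedelta

# Assumed value; the actual interval lives in the project's code.
COOLDOWN = timedelta(minutes=30)


def should_refresh(last_refreshed_at, now, force=False):
    """Return True if a feed may be fetched again.

    last_refreshed_at: datetime of the previous fetch, or None if never fetched.
    now: current datetime, passed in explicitly to keep the function testable.
    force: mimics the effect of --force by skipping the cooldown check.
    """
    if force or last_refreshed_at is None:
        return True
    return now - last_refreshed_at >= COOLDOWN


if __name__ == "__main__":
    last = datetime(2020, 1, 1, 12, 0)
    print(should_refresh(last, last + timedelta(minutes=10)))
    print(should_refresh(last, last + timedelta(minutes=10), force=True))
```
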
#### Step 6: Run dev server

```
python3 manage.py runserver 8000
```

Then go to [localhost:8000](http://localhost:8000) again.

## boards.yml format

TBD

## Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

You can help us with open issues too. There's always something to work on.

We don't have any strict rules on formatting. Just explain your motivation and the changes you've made in the PR description so that others understand what's going on.

## License

[Apache 2.0](LICENSE) © Vasily Zubarev

> TL;DR: you can modify, distribute and use it commercially,
> but you MUST reference the original author or give a link to the service
117
boards.yml
@@ -1,46 +1,18 @@
|
||||
boards:
|
||||
- name: Технологии
|
||||
- name: Tech
|
||||
slug: tech
|
||||
is_visible: true
|
||||
is_private: false
|
||||
is_visible: true # visibility on the main page
|
||||
is_private: false # logging in is not required to view
|
||||
curator:
|
||||
name: Технологии
|
||||
title: Главные новости
|
||||
name: Tech
|
||||
title: Main news
|
||||
avatar: https://i.vas3k.ru/fhr.png
|
||||
bio: Подборка основных изданий о технологиях на русском и английском языках
|
||||
bio: Major technology media in English and Russian
|
||||
footer: >
|
||||
это общая подборка популярных технологических СМИ.
|
||||
Она сделана по итогам опроса в моём телеграм-канале.
|
||||
Жирным выделяются свежие статьи.
|
||||
Страница обновляется раз в час.
|
||||
this is a general selection of popular technology media.
|
||||
The page is updated once per hour.
|
||||
blocks:
|
||||
- name: На русском
|
||||
slug: ru
|
||||
feeds:
|
||||
- name: "vc.ru: Технологии"
|
||||
url: https://vc.ru
|
||||
rss: https://vc.ru/rss/all
|
||||
conditions:
|
||||
- type: in
|
||||
field: link
|
||||
in: "https://vc.ru/tech/"
|
||||
- name: TJ
|
||||
url: https://tjournal.ru
|
||||
rss: https://tjournal.ru/rss/all
|
||||
- name: "Хабр: лучшее за сутки"
|
||||
url: https://habr.ru
|
||||
rss: https://habr.com/ru/rss/best/daily/?fl=ru
|
||||
- name: iXBT
|
||||
url: https://www.ixbt.com
|
||||
rss: http://www.ixbt.com/export/news.rss
|
||||
icon: https://i.vas3k.ru/fkm.jpg
|
||||
- name: Tproger
|
||||
url: https://tproger.ru/
|
||||
rss: https://tproger.ru/feed/
|
||||
- name: OpenNet
|
||||
url: https://www.opennet.ru/
|
||||
rss: https://www.opennet.ru/opennews/opennews_6.rss
|
||||
- name: На английском
|
||||
- name: English
|
||||
slug: en
|
||||
feeds:
|
||||
- name: Hacker News
|
||||
@@ -52,6 +24,7 @@ boards:
|
||||
- name: TechCrunch
|
||||
rss: http://feeds.feedburner.com/TechCrunch/
|
||||
url: https://techcrunch.com
|
||||
is_parsable: false # do not try to parse pages, show RSS content only
|
||||
- name: Engadget
|
||||
rss: https://www.engadget.com/rss.xml
|
||||
url: https://www.engadget.com
|
||||
@@ -86,7 +59,34 @@ boards:
|
||||
- name: ReadWrite
|
||||
url: https://readwrite.com
|
||||
rss: https://readwrite.com/feed/
|
||||
- name: Микс блогов
|
||||
- name: Russian
|
||||
slug: ru
|
||||
feeds:
|
||||
- name: "vc.ru"
|
||||
url: https://vc.ru
|
||||
rss: https://vc.ru/rss/all
|
||||
is_parsable: false
|
||||
conditions:
|
||||
- type: in
|
||||
field: link
|
||||
in: "https://vc.ru/tech/" # just an example, no real benefits
|
||||
- name: TJ
|
||||
url: https://tjournal.ru
|
||||
rss: https://tjournal.ru/rss/all
|
||||
- name: "Habr.com"
|
||||
url: https://habr.com
|
||||
rss: https://habr.com/ru/rss/best/daily/?fl=ru
|
||||
- name: iXBT
|
||||
url: https://www.ixbt.com
|
||||
rss: http://www.ixbt.com/export/news.rss
|
||||
icon: https://i.vas3k.ru/fkm.jpg
|
||||
- name: Tproger
|
||||
url: https://tproger.ru/
|
||||
rss: https://tproger.ru/feed/
|
||||
- name: OpenNet
|
||||
url: https://www.opennet.ru/
|
||||
rss: https://www.opennet.ru/opennews/opennews_6.rss
|
||||
- name: Mix
|
||||
slug: the_mix
|
||||
feeds:
|
||||
- url: http://www.rssmix.com/
|
||||
@@ -95,7 +95,7 @@ boards:
|
||||
mix:
|
||||
- http://vas3k.ru/rss/
|
||||
- http://nedbatchelder.com/blog/rss.xml
|
||||
- name: Мейнстрим
|
||||
- name: Mainstream
|
||||
slug: mainstream
|
||||
feeds:
|
||||
- name: "WSJ: Tech"
|
||||
@@ -109,31 +109,28 @@ boards:
|
||||
rss: http://feeds.reuters.com/reuters/technologyNews
|
||||
|
||||
|
||||
- name: Вастрик
|
||||
slug: vas3k
|
||||
is_visible: true
|
||||
is_private: true
|
||||
curator:
|
||||
name: Вастрик
|
||||
url: https://vas3k.ru
|
||||
title: Айти и путешествия
|
||||
avatar: https://i.vas3k.ru/eb8.png
|
||||
bio: Веду блог о технологиях, пишу код, отвратительно путешествую и фотографирую это
|
||||
footer: >
|
||||
здесь я собрал сайты, которые составляют 90% того, что я читаю постоянно.
|
||||
Отбор и фильтрация источников — непрерывный процесс для меня, потому их набор постоянно меняется.
|
||||
Так что следите.
|
||||
blocks:
|
||||
|
||||
- name: How to Berlin
|
||||
slug: howtoberlin
|
||||
is_visible: true
|
||||
is_private: true
|
||||
is_private: false
|
||||
curator:
|
||||
name: Лена How to Berlin
|
||||
name: How to Berlin
|
||||
url: https://howtoberlin.de
|
||||
title: Набор Берлинца
|
||||
title: Berliner kit
|
||||
avatar: https://i.vas3k.ru/fev.png
|
||||
bio: Что читать когда переехал в Берлин и не понимаешь что происходит вокруг
|
||||
bio: What to read when you've moved to Berlin and don't know what's going on around you
|
||||
blocks:
|
||||
|
||||
- name: Main and expat news
|
||||
slug: news
|
||||
feeds:
|
||||
- name: "Berlin.de"
|
||||
url: https://www.berlin.de/aktuelles/
|
||||
rss: https://www.berlin.de/en/news/index.rss
|
||||
icon: https://i.vas3k.ru/fjc.png
|
||||
- name: "DW.com"
|
||||
url: https://www.dw.com/en/top-stories/germany/s-1432
|
||||
rss: http://rss.dw.com/rdf/rss-en-ger
|
||||
- name: "TheLocal"
|
||||
url: https://www.thelocal.de/
|
||||
rss: https://feeds.thelocal.com/rss/de
|
||||
is_parsable: false
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
version: '3.7'
|
||||
|
||||
services:
|
||||
infomate_app:
|
||||
infomate_app: &app
|
||||
build:
|
||||
context: .
|
||||
args:
|
||||
@@ -28,3 +28,10 @@ services:
|
||||
- POSTGRES_DB=infomate
|
||||
ports:
|
||||
- 5432
|
||||
|
||||
migrate_and_init:
|
||||
<<: *app
|
||||
container_name: infomate_migrate_and_init
|
||||
restart: "no"
|
||||
ports: []
|
||||
command: make migrate feed_init
|
||||
|
||||
@@ -41,7 +41,7 @@
|
||||
|
||||
{% block footer %}
|
||||
<div class="footer">
|
||||
Сделал <a href="https://vas3k.ru">Вастрик</a>.<br><br>
|
||||
Сделал <a href="https://vas3k.ru">Вастрик</a>. Код проекта <a href="https://github.com/vas3k/infomate.club">открыт</a>.<br><br>
|
||||
Сайт использует <a href="https://ru.wikipedia.org/wiki/Cookie" target="_blank">куки</a> для авторизации<br> и собирает <a href="{% url "privacy_policy" %}">анонимные данные</a> для статистики.
|
||||
{% if me %}
|
||||
<br><a href="{% url "logout" %}" class="button logout-button">Выйти</a>
|
||||
|
||||