mirror of
https://codeberg.org/scip/kleingebaeck.git
synced 2025-12-16 03:51:02 +01:00
248 lines
6.2 KiB
Plaintext
248 lines
6.2 KiB
Plaintext
=head1 NAME
|
|
|
|
kleingebaeck - kleinanzeigen.de backup tool
|
|
|
|
=head1 SYNOPSYS
|
|
|
|
Usage: kleingebaeck [-dvVhmoc] [<ad-listing-url>,...]
|
|
Options:
|
|
-u --user <uid> Backup ads from user with uid <uid>.
|
|
-d --debug Enable debug output.
|
|
-v --verbose Enable verbose output.
|
|
-o --outdir <dir> Set output dir (default: current directory)
|
|
-l --limit <num> Limit the ads to download to <num>, default: load all.
|
|
-c --config <file> Use config file <file> (default: ~/.kleingebaeck).
|
|
--ignoreerrors Ignore HTTP errors, may lead to incomplete ad backup.
|
|
-f --force Overwrite images and ads even if the already exist.
|
|
-m --manual Show manual.
|
|
-h --help Show usage.
|
|
-V --version Show program version.
|
|
|
|
=head1 DESCRIPTION
|
|
|
|
This tool can be used to backup ads on the german ad page L<https://kleinanzeigen.de>.
|
|
|
|
It downloads all (or only the specified ones) ads of one user into a
|
|
directory, each ad into its own subdirectory. The backup will contain
|
|
a textfile B<Adlisting.txt> which contains the ad contents such as
|
|
title, body, price etc. All images will be downloaded as well.
|
|
|
|
=head1 CONFIGURATION
|
|
|
|
You can create a config file to save typing. By default
|
|
C<~/.kleingebaeck> is being used but you can specify one with C<-c> as
|
|
well. We use TOML as our configuration language. See
|
|
L<https://toml.io/en/>.
|
|
|
|
Format is pretty simple:
|
|
|
|
user = 1010101
|
|
loglevel = verbose
|
|
outdir = "test"
|
|
useragent = "Mozilla/5.0"
|
|
template = """
|
|
Title: {{.Title}}
|
|
Price: {{.Price}}
|
|
Id: {{.ID}}
|
|
Category: {{.Category}}
|
|
Condition: {{.Condition}}
|
|
Created: {{.Created}}
|
|
|
|
{{.Text}}
|
|
"""
|
|
|
|
Be careful if you want to change the template. The variable is a
|
|
multiline string surrounded by three double quotes. You can left out
|
|
certain fields and use any formatting you like. Refer to
|
|
L<https://pkg.go.dev/text/template> for details how to write a
|
|
template. Also read the TEMPLATES section below.
|
|
|
|
If you're on windows and want to customize the output directory, put
|
|
it into single quotes to avoid the backslashes interpreted as escape
|
|
chars like this:
|
|
|
|
outdir = 'C:\Data\Ads'
|
|
|
|
=head1 TEMPLATES
|
|
|
|
Various parts of the configuration can be modified using templates:
|
|
the output directory, the ad directory and the ad listing itself.
|
|
|
|
=head2 OUTPUT DIR TEMPLATE
|
|
|
|
The config varialbe C<outdir> or the command line parameter C<-o> take a
|
|
template which may contain:
|
|
|
|
=over
|
|
|
|
=item C<{{.Year}}>
|
|
|
|
=item C<{{.Month}}>
|
|
|
|
=item C<{{.Day}}>
|
|
|
|
=back
|
|
|
|
That way you can create a new output directory for every backup
|
|
run. For example:
|
|
|
|
outdir = "/home/backups/ads-{{.Year}}-{{.Month}}-{{.Day}}"
|
|
|
|
Or using the command line flag:
|
|
|
|
-o "/home/backups/ads-{{.Year}}-{{.Month}}-{{.Day}}"
|
|
|
|
The default value is C<.> - the current directory.
|
|
|
|
=head2 AD DIRECTORY TEMPLATE
|
|
|
|
The ad directory name can be modified using the following ad values:
|
|
|
|
=over
|
|
|
|
=item {{.Price}}
|
|
|
|
=item {{.ID}}
|
|
|
|
=item {{.Category}}
|
|
|
|
=item {{.Condition}}
|
|
|
|
=item {{.Created}}
|
|
|
|
=item {{.Slug}}
|
|
|
|
=item {{.Text}}
|
|
|
|
=back
|
|
|
|
It can only be configured in the config file. By default only
|
|
C<{{.Slug}}> is being used, this is the title of the ad in url format.
|
|
|
|
=head2 AD NAME TEMPLATE
|
|
|
|
The name of the directory per ad can be tuned as well:
|
|
|
|
=over
|
|
|
|
=item C<{{.Year}}>
|
|
|
|
=item C<{{.Month}}>
|
|
|
|
=item C<{{.Day}}>
|
|
|
|
=item C<{{.Slug}}>
|
|
|
|
=item C<{{.Category}}>
|
|
|
|
=item C<{{.ID}}>
|
|
|
|
|
|
=back
|
|
|
|
=head2 AD TEMPLATE
|
|
|
|
The ad listing itself can be modified as well, using the same
|
|
variables as the ad name template above.
|
|
|
|
This is the default template:
|
|
|
|
Title: {{.Title}}
|
|
Price: {{.Price}}
|
|
Id: {{.ID}}
|
|
Category: {{.Category}}
|
|
Condition: {{.Condition}}
|
|
Type: {{.Type}}
|
|
Created: {{.Created}}
|
|
Expire: {{.Expire}}
|
|
|
|
{{.Text}}
|
|
|
|
The config parameter to modify is C<template>. See example.conf in the
|
|
source repository. Please take care, since this is a multiline
|
|
string. This is how it shall look if you modify it:
|
|
|
|
template="""
|
|
Title: {{.Title}}
|
|
|
|
{{.Text}}
|
|
"""
|
|
|
|
That is, the content between the two C<"""> chars is the template.
|
|
|
|
=head1 SETUP
|
|
|
|
To setup the tool, you need to lookup your userid on
|
|
kleinanzeigen.de. Go to your ad overview page while NOT being logged
|
|
in:
|
|
|
|
https://www.kleinanzeigen.de/s-bestandsliste.html?userId=XXXXXX
|
|
|
|
The B<XXXXX> part is your userid.
|
|
|
|
Put it into the configfile as outlined above. Also specify an output
|
|
directory. Then just execute C<kleingebaeck>.
|
|
|
|
You can use the B<-v> option to get verbose output or B<-d> to enable
|
|
debugging.
|
|
|
|
=head1 ENVIRONMENT VARIABLES
|
|
|
|
The following environment variables are considered:
|
|
|
|
KLEINGEBAECK_USER
|
|
KLEINGEBAECK_DEBUG
|
|
KLEINGEBAECK_VERBOSE
|
|
KLEINGEBAECK_OUTDIR
|
|
KLEINGEBAECK_LIMIT
|
|
KLEINGEBAECK_CONFIG
|
|
KLEINGEBAECK_IGNOREERRORS
|
|
|
|
Please note, that they take precedence over config file, but
|
|
commandline flags take precedence over env!
|
|
|
|
|
|
|
|
=head1 BUGS
|
|
|
|
In order to report a bug, unexpected behavior, feature requests
|
|
or to submit a patch, please open an issue on github:
|
|
L<https://codeberg.org/scip/kleingebaeck/issues>.
|
|
|
|
Please repeat the failing command with debugging enabled C<-d> and
|
|
include the output in the issue.
|
|
|
|
=head1 LIMITATIONS
|
|
|
|
The C<kleingebaeck> doesn't currently check if it has downloaded a
|
|
file already, so it downloads everything again every time you execute
|
|
it. Be aware of it. This will change in the future.
|
|
|
|
Also there's currently no parallelization implemented. This will
|
|
change in the future.
|
|
|
|
=head1 LICENSE
|
|
|
|
Copyright 2023-2025 Thomas von Dein
|
|
|
|
This program is free software: you can redistribute it and/or modify
|
|
it under the terms of the GNU General Public License as published by
|
|
the Free Software Foundation, either version 3 of the License, or
|
|
(at your option) any later version.
|
|
|
|
This program is distributed in the hope that it will be useful,
|
|
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
|
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
|
GNU General Public License for more details.
|
|
|
|
You should have received a copy of the GNU General Public License
|
|
along with this program. If not, see L<http://www.gnu.org/licenses/>.
|
|
|
|
=head1 Author
|
|
|
|
T.v.Dein <tom AT vondein DOT org>
|
|
|
|
|
|
|
|
=cut
|