Planet Python

Last update: November 10, 2024 01:44 AM UTC

November 08, 2024

Real Python

The Real Python Podcast – Episode #227: New PEPs: Template Strings & External Wheel Hosting

Have you wanted the flexibility of f-strings but need safety checks in place? What if you could have deferred evaluation for logging or avoiding injection attacks? Christopher Trudeau is back on the show this week, bringing another batch of PyCoder's Weekly articles and projects.

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

November 08, 2024 12:00 PM UTC

PyBites

A Practical Example of the Pipeline Pattern in Python

What is this pattern about?

The Pipeline design pattern (also known as Chain of Command pattern) is a flexible way to handle a sequence of actions, where each handler in the chain processes the input data and passes it to the next handler. This pattern is commonly used in scenarios involving data processing, web scraping, or middleware systems.

In this blog post, I’ll walk you through a specific example that leverages Python’s powerful functools.reduce and partial functions, along with the BeautifulSoup library for parsing HTML content. This code showcases the Pipeline pattern applied to HTML table extraction and processing.

What Does the Code Do?

The code defines a pipeline of data parsing steps for extracting and cleaning tables from an HTML file. It follows a functional programming approach to compose several processing functions into a unified flow using the Chain of Command pattern.

Key Concepts

Functional Composition: Combining multiple functions into one that executes in a specific order.
Data Parsing Pipeline: Sequential processing of HTML content into structured data (a DataFrame).
Error Handling: Ensuring the pipeline gracefully handles missing or malformed data.

Let’s break down the code step by step:

1. Function Composition with `compose`

from functools import reduce, partial
from typing import Callable

The pipeline is created by composing multiple parsing functions into a single unified function. The compose function uses reduce to chain these functions together:

def compose(*functions: ParsingPipeline) -> ParsingPipeline:
    """Composes functions into a single function"""
    return reduce(lambda f, g: lambda x: g(f(x)), functions, lambda x: x)

This allows you to define an ordered flow of operations that process input data from one function to the next. Each function modifies the input data, which is then passed down the pipeline.

2. Reading HTML Content

The first step in the pipeline is to read the contents of an HTML file. This is done by read_htm_from:

def read_htm_from(filename: T, mode: T = "r", encoding: T = "utf-8") -> T:
    with open(filename, mode, encoding=encoding) as file:
        html_content = file.read()
    return html_content

This function opens an HTML file and returns its content as a string. It supports different file modes and encodings, making it flexible for various file formats.

Note that T is defined here as TypeVar("T"), see the typing docs.

3. Parsing the HTML Table

Next, read_table_from uses BeautifulSoup to find the HTML table within the file:

from bs4 import BeautifulSoup

def read_table_from(htm_file: T, parser: str = "html.parser") -> T:
    soup = BeautifulSoup(htm_file, parser)
    table = soup.find("table")
    return table

This function converts the HTML content into a BeautifulSoup object and extracts the first table it finds. The parsed table is passed down the pipeline for further processing.

4. Extracting Rows and Data

Once the table is identified, the pipeline extracts the rows and applies filtering logic based on custom markers:

def extract_row_data_from(
    table_rows: T, start_markers: T, continue_markers: T, end_markers: T
) -> T:
    row_data: T = []
    start_processing = False
    for row in table_rows:
        if any(marker in row.text for marker in start_markers) and not start_processing:
            start_processing = True
            continue
        if start_processing:
            if any(marker in row.text for marker in continue_markers):
                continue
            if any(marker in row.text for marker in end_markers):
                break
            row_data.append(row)
    return row_data[:-1]

This function inspects each row in the table, checking if the row text matches specified start, continue, or end markers. Data extraction begins after encountering the start marker and ends when the end marker is found.

5. Converting Rows to DataFrame

The next steps involve transforming the extracted row data into a structured pandas DataFrame. First, the rows are separated into individual columns using separate_columns_in:

def separate_columns_in(rows: T) -> T:
    data_rows: T = []
    try:
        for row in rows:
            columns = row.find_all(["td", "th"])
            data = [col.text for col in columns]
            data_rows.append(data)
        return data_rows
    except Exception as e:
        print(f"An error occurred: {str(e)}")
        return []

Then, convert_to_dataframe reshapes this data into a pandas DataFrame:

def convert_to_dataframe(data_rows: T) -> T:
    df = pd.DataFrame(data_rows)
    df = df.rename(columns=df.iloc[0]).drop(df.index[0])
    df.columns = COLUMN_NAMES
    df.drop(columns=COLUMNS_TO_REMOVE, axis=1, inplace=True)
    df.set_index(df.columns[0], inplace=True, drop=True)
    return df

The DataFrame is cleaned up by renaming columns, removing unnecessary columns, and setting the correct index.

6. Assigning Correct Data Types

Finally, assign_correct_data_type_to ensures that the DataFrame columns have the appropriate data types:

def assign_correct_data_type_to(
    df: T,
    dict_types: dict[str, str] = COLUMN_TYPES,
    datetime_columns: list[str] = DATETIME_COLUMN_NAMES,
) -> T:
    if not isinstance(df, pd.DataFrame):
        raise ValueError("Input `df` must be a pandas DataFrame.")
    df = df.copy()
    for column in datetime_columns:
        if column in df.columns:
            df[column] = pd.to_datetime(df[column])
    for column, col_type in dict_types.items():
        if column in df.columns:
            try:
                if col_type == "numeric":
                    df[column] = pd.to_numeric(df[column], errors="coerce")
                else:
                    df[column].astype(col_type)
            except Exception as e:
                print(f"Error converting column {column} to {col_type}: {e}")
    return df

This function converts columns into numeric or datetime formats as needed, ensuring that the data is properly structured for analysis.

7. Putting It All Together

At the end of the code, the pipeline is composed by chaining all of the above functions together:

parse_gbx_bt: ParsingPipeline = compose(
    partial(read_htm_from, mode="r", encoding="utf-8"),
    read_table_from,
    read_rows_from,
    partial(
        extract_row_data_from,
        start_markers=["Closed Transactions:"],
        continue_markers=["Genbox", "balance", "Deposit"],
        end_markers=["Closed P/L:"],
    ),
    separate_columns_in,
    convert_to_dataframe,
    assign_correct_data_type_to,
)

This creates a fully automated pipeline that:

Reads an HTML file.
Extracts table data.
Cleans and converts the data into a pandas DataFrame.
Assigns the correct data types.

Conclusion

This implementation of the Chain of Command or Pipeline pattern in Python demonstrates how to apply functional programming principles to data parsing tasks. The use of functools.reduce and partial, and BeautifulSoup provides a flexible, reusable way to process HTML content and structure it into usable data.

If you’re looking to create complex data processing pipelines that need to handle dynamic data from HTML or other sources, this approach is a clean and maintainable solution.

You can find the code in the repo: https://github.com/jjeg1979/pyBacktestAnalyzer.

And if you want to watch the code clinic where I presented the tool, feel free to check it out at https://pybites.circle.so/c/circle-coaching-calls/python-for-the-trader-code-clinic.

If you cannot access…well, what are you waiting for to become a PDM member?

November 08, 2024 09:51 AM UTC

Julien Tayon

The crudest CRUD of them all : the smallest CRUD possible in 150 lines of python

Right now, I am on a never ending quest that requires me to think of building a full fledge MVC controller : an anti-jira tracker that would favours HARD CHECKED facts over wishful thinking.

For this to begin, I am not really motivated in beginning with a full fledged MVC (Model View Controller) à la django because there is a lot of boilerplates and actions to do before a result. But, it has a lot of feature I want, including authentication, authorization and handling security.

For prototypes we normally flavours lightweight framework (à la flask), and CRUD.

CRUD approach is a factorisation of all framework in a single dynamic form that adapts itself to the model to generate HTML forms to input data, tabulate, REST endpoints and search them from the python class declaration and generate the database model. One language to rule them all : PYTHON. You can easily generate even the javascript to handle autocompletion on the generated view from python with enough talent.

But before using a CRUD framework, we need a cruder one, ugly, disgusting but useful for a human before building the REST APIs, writing the class in python, the HTML form, and the controlers.

I call this the crudest CRUD of them all.

Think hard at what you want when prototyping ...

to write no CONTROLLERS ; flask documentation has a very verbose approach to exposing routes and writing them, writing controller for embasing and searching databases is boring
to write the fewer HTML views possible, one and only onle would be great ;
to avoid having to fiddle the many files reflecting separation of concerns : the lesser python files and class you touch the better;
to avoid having to write SQL nor use an ORM (at least a verbose declarative one) ;
show me your code and you can mesmerize and even fool me, however show me your data structure and I'll know everthing I have to know about your application : data structure should be under your nose in a readable fashion in the code;/
to have AT LEAST one end point for inserting and searching so that curl can be used to begin automation and testing, preferably in a factorisable fashion;
only one point of failure is accepted

Once we set these few condition we see whatever we do WE NEED a dynamic http server at the core. Python being the topic here, we are gonna do it in python.

What is the simplest dynamic web server in python ?

The reference implementation of wsgi that is the crudest wsgi server of them all : wsgiref. And you don't need to download it since it's provided in python stdlib.

First thing first, we are gonna had a default view so that we can serve an HTML static page with the list of the minimal HTML we need to interact with data : sets of input and forms.

Here, we stop. And we see that these forms are describing the data model.

Wouldn't it be nice if we could parse the HTML form easily with a tool from the standard library : html.parser and maybe deduce the database model and even more than fields coud add relationship, and well since we are dreaming : what about creating the tables on the fly from the form if they don't exists ?

The encoding of the relationship do require an hijack of convention where when the parser cross a name of the field in the form whatever_id it deduces it is a foreign key to table « whatever », column « id ».
Once this is done, we can parse the html, do some magick to match HTML input types to database types (adapter) and it's almost over. We can even dream of creating the database if it does not exists in a oneliner for sqlite.

We just need to throw away all the frugality of dependencies by the window and spoil our karma of « digital soberty » by adding the almighty sqlalchemy the crudest (but still heavy) ORM when it comes of the field of the introspective features of an ORM to map a database object to a python object in a clear consistent way. With this, just one function is needed in the controller to switch from embasing (POST method) and searching (GET).

Well, if the DOM is passed in the request. So of course I see the critics here :

we can't pass the DOM in the request because the HTML form ignores the DOM
You are not scared of error 415 (request too large) in the get method if you pass the DOM ?

That's where we obviously need two important tools : 1) javascript, 2) limitations.

Since we are human we would also like the form to be readable when served, because, well, human don't read the source and can't see the name attributes of the input. A tad of improving the raw html would be nice. It would also give consistency. It will also diminishes the required size of the formular to send. Here, javascript again is the right anwser. Fine, we serve the static page in the top of the controller. Let's use jquery to make it terse enough. Oh, if we have Javascript, wouldn't il be able to clone the part of the invented model tag inside every form so now we can pass the relevant part of the DOM to the controller ?

I think we have everything to write the crudest CRUD server of them all :D

Happy code reading :

import multipart
from wsgiref.simple_server import make_server
from json import dumps
from sqlalchemy import *
from html.parser import HTMLParser
from base64 import b64encode
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import Session
from dateutil import parser
from sqlalchemy_utils import database_exists, create_database
from urllib.parse import parse_qsl, urlparse

engine = create_engine("sqlite:///this.db")
if not database_exists(engine.url):
    create_database(engine.url)

tables = dict()

class HTMLtoData(HTMLParser):
    def __init__(self):
        global engine, tables
        self.cols = []
        self.table = ""
        self.tables= []
        self.engine= engine
        self.meta = MetaData()
        super().__init__()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        simple_mapping = dict(
            email = UnicodeText, url = UnicodeText, phone = UnicodeText, text = UnicodeText,
            date = Date, time = Time, datetime = DateTime, file = Text
        )
        if tag == "input":
            if attrs.get("name") == "id":
                self.cols += [ Column('id', Integer, primary_key = True), ]
                return
            try:
                if attrs.get("name").endswith("_id"):
                    table,_=attrs.get("name").split("_")
                    self.cols += [ Column(attrs["name"], Integer, ForeignKey(table + ".id")) ]
                    return
            except Exception as e: print(e)

            if attrs.get("type") in simple_mapping.keys():
                self.cols += [ Column(attrs["name"], simple_mapping[attrs["type"]]), ]

            if attrs["type"] == "number":
                if attrs["step"] == "any":
                    self.cols+= [ Columns(attrs["name"], Float), ]
                else:
                    self.cols+= [ Column(attrs["name"], Integer), ]
        if tag== "form":
            self.table = urlparse(attrs["action"]).path[1:]

    def handle_endtag(self, tag):
        if tag=="form":
            self.tables += [ Table(self.table, self.meta, *self.cols), ]
            tables[self.table] = self.tables[-1]
            self.table = ""
            self.cols = []
            with engine.connect() as cnx:
                self.meta.create_all(engine)
                cnx.commit()
html = """
<!doctype html>
<html>
<head>
<style>
* {    font-family:"Sans Serif" }
body { text-align: center; }
fieldset {  border: 1px solid #666;  border-radius: .5em; width: 30em; margin: auto; }
form { text-align: left; display:inline-block; }
input { margin-bottom:1em; padding:.5em;}
[value=create] { background:#ffffba} [value=delete] { background:#bae1ff} [value=update] { background:#ffdfda}
[value=read] { background:#baffc9}
[type=submit] { margin-right:1em; margin-bottom:0em; border:1px solid #333; padding:.5em; border-radius:.5em; }
</style>
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.7.1/jquery.min.js"></script>
<script>
$(document).ready(function() {
    $("form").each((i,el) => {
        $(el).wrap("<fieldset></fieldset>"  );
        $(el).before("<legend>" + el.action + "</legend>");
        $(el).append("<input name=_action type=submit value=create ><input name=_action type=submit value=read >")
        $(el).append("<input name=_action type=submit value=update ><input name=_action type=submit value=delete >")
    });
    $("input:not([type=hidden],[type=submit])").each((i,el) => {
        $(el).before("<label>" + el.name+ "</label><br/>");
        $(el).after("<br>");
    });
});
</script>
</head>
<body >
    <form  action=/user method=post >
        <input type=number name=id />
        <input type=text name=name />
        <input type=email name=email />
    </form>
    <form action=/event method=post >
        <input type=number name=id />
        <input type=date name=from_date />
        <input type=date name=to_date />
        <input type=text name=text />
        <input type=number name=user_id />
    </form>
</body>
</html>

"""


router = dict({"" : lambda fo: html,})

def simple_app(environ, start_response):
    fo, fi=multipart.parse_form_data(environ)
    fo.update(**{ k: dict(
            name=fi[k].filename,
            content_type=fi[k].content_type,
            content=b64encode(fi[k].file.read())
        ) for k,v in fi.items()})
    table = route = environ["PATH_INFO"][1:]
    fo.update(**dict(parse_qsl(environ["QUERY_STRING"])))
    HTMLtoData().feed(html)
    metadata = MetaData()
    metadata.reflect(bind=engine)
    Base = automap_base(metadata=metadata)
    Base.prepare()
    attrs_to_dict = lambda attrs : {  k: (
                    "date" in k or "time" in k ) and type(k) == str
                        and parser.parse(v) or
                    "file" in k and f"""data:{fo[k]["content_type"]}; base64, {fo[k]["content"].decode()}""" or v
                    for k,v in attrs.items() if v and not k.startswith("_")
    }
    if route in tables.keys():
        start_response('200 OK', [('Content-type', 'application/json; charset=utf-8')])
        with Session(engine) as session:
            try:
                action = fo.get("_action", "")
                Item = getattr(Base.classes, table)
                if action == "delete":
                    session.delete(session.get(Item, fo["id"]))
                    session.commit()
                    fo["result"] = "deleted"
                if action == "create":
                    new_item = Item(**attrs_to_dict(fo))
                    session.add(new_item)
                    session.flush()
                    ret=session.commit()
                    fo["result"] = new_item.id
                if action == "update":
                    session.delete(session.get(Item, fo["id"]))
                    new_item = Item(**attrs_to_dict(fo))
                    session.add(new_item)
                    session.commit()
                    fo["result"] = new_item.id
                if action in { "read", "search" }:
                    result = []
                    for elt in session.execute(
                        select(Item).filter_by(**attrs_to_dict(fo))).all():
                        result += [{ k.name:getattr(elt[0], k.name) for k in tables[table].columns}]
                    fo["result"] = result
            except Exception as e:
                fo["error"] = e
                session.rollback()
    else:
        start_response('200 OK', [('Content-type', 'text/html; charset=utf-8')])

    return [ router.get(route,lambda fo:dumps(fo.dict, indent=4, default=str))(fo).encode() ]

print("Crudest CRUD of them all on port 5000...")
make_server('', 5000, simple_app).serve_forever()

November 08, 2024 06:54 AM UTC

Armin Ronacher

What if My Tribe Is Wrong?

I wrote in the past about how I'm a pessimist that strives for positive outcomes. One of the things that I gradually learned is is wishing others to succeed. That is something that took me a long time to learn. I did not see the value in positive towards other people's success, but there is. There is one thing to be sceptical to a project or initiative, but you can still encourage the other person and wish them well.

I think not wishing others well is a coping mechanism of sorts. For sure it was for me. As you become more successful in life, it becomes easier to be supportive, because you have established yourself in one way or another and you feel more secure about yourself.

That said, there is something I continue to struggle with, and that are morals. What if the thing the other person is doing seems morally wrong to me? I believe that much of this struggle stems from the fear of feeling complicit in another's choices. Supporting someone — even passively — can feel like tacit approval, and that can be unsettling. Perhaps encouragement doesn't need to imply agreement. Another angle to consider is that my discomfort may actually stem from my own insecurities and doubts. When someone's path contradicts my values, it can make me question my own choices. This reaction often makes it hard to wish them well, even when deep down I want to.

What if my tribe is just wrong on something? I grew up with the idea of “never again”. Anything that remotely looks like fascism really triggers me. There is a well known propaganda film from the US Army called “Don't Be a Sucker” which warns Americans about the dangers of prejudice, discrimination, and fascist rhetoric. I watched this a few times over the years and it still makes me wonder how people can fall for that kind of rhetoric.

But is it really all that hard? Isn't that happening today again? I have a very hard time supporting what Trump or Musk are standing for or people that align with them. Trump's rhetoric and plans are counter to everything I stand for and the remind me a lot of that film. It's even harder for me with Musk. His morals are completely off, he seems to a person I would not want to be friends with, yet he's successful and he's pushing humanity forward.

It's challenging to reconcile my strong opposition to their (and other's) rhetoric and policies with the need to maintain a nuanced view of them. Neither are “literal Hitler”. Equating them with the most extreme historical figures oversimplifies the situation and shuts down productive conversation.

Particularly watching comedy shows reducing Trump to a caricature feels wrong to me. Plenty of his supporters have genuine concerns. I find it very hard to engage with these complexities and it's deeply uncomfortable and quite frankly exhausting.

Life becomes simpler when you just pick a side, but it will strip away the deeper understanding and nuance I want to hold onto. I don’t want to fall into the trap of justifying or defending behaviors I fundamentally disagree with, nor do I want to completely shut out the perspectives of those who support him. This means accepting that people I engage with, might see things very differently, and that maintaining those relationships and wishing them well them requires a level of tolerance I'm not sure I possess yet.

The reason it's particularly hard to me that even if I accept that my tribe maybe wrong in parts, I can see the effects that Trump and others already had on individuals. Think of the Muslim travel ban which kept families apart for years, his border family separation policy, the attempted repeal of Section 230. Some of it was not him, but people he aligned with. Things like the overturning of Roe v. Wade and the effects it had on women, the book bans in Florida, etc. Yes, not quite Hitler, but still deeply problematic for personal freedoms. So I can't ignore the harm that some of these policies have caused in the past and even if I take the most favorable view of him, I have that track record to hold against him.

In the end where does that leave me? Listening, understanding, and standing firm in my values. But not kissing the ring. And probably coping by writing more.

November 08, 2024 12:00 AM UTC

Michael Foord

Current Generative AI and the Future

AIMeme

I’ve seen this meme a bunch of times recently. I always reply; what is asserted without evidence may be dismissed without consideration.

Current Gen AI is flawed by hallucination issues, mired in copyright controversy, expensive to run and lacking clear use cases. (Although it’s pretty good at code generation). It’s a massive hype train.

Gen AI, as it is now, was made possible by the invention of “Transformer Architecture” by Google in 2017. We’re seeing fast paced change and development, but all built on that technology.

At some point another quantum breakthrough will change things all over again - and make another step towards AGI. Although it will take several such steps, and order of magnitudes larger models (and multi models), to create anything resembling true AI.

So a huge number of disparate individuals, institutions, governments and companies are pursuing the development of AI. There’s no single cohesive agenda behind it. As new technologies arise we adapt to them, find uses for them, and everyone pursues their agendas with them.

Not particularly special to AI I don’t think.

November 08, 2024 12:00 AM UTC

Python Metaclasses in Eight Words

Metaclasses

Python metaclasses, considered advanced programming and Python “black magick” (*) explained in eight words:

The type of a class is a class.

Here’s what knowledge of Object Oriented theory and type systems permit you to deduce from this:

Using the word “class”, instead of “the type of a class is type” or even “the type of a class is a type, classes are types”, implies that a user defined class can be a metaclass. This is indeed the case, and the point of metaclasses in Python.

The type is responsible for creating new instances. So if the type of a class is a class then we can write classes that create classes. Indeed this is the primary usecase for metaclasses.

(Deeper knowledge of Python, and the two phase object creation protocol, may lead you to deduce that this is done by overriding the __new__ method. If you’re familiar with “type” as a class factory you can probably even guess the signature and that you must inherit from type.)

If the type of a class is a class then the type system will permit a type check for the class against its class. And indeed isinstance(klass, metaclass) returns true.

(And deeper knowledge of Python will tell you that the magic methods, the protocol methods, are always looked up on the type. So we can implement behaviour for class objects by providing magic methods on the metaclass.)

All of this implies that classes are themselves objects. Which is true in Python for everything is an object in Python (and everything is a reference).

And so on…

Type and class are synonyms in Python.
type(type) is type

And to further round out the type system, these are also Python invariants:

isinstance(object, object) is True # object is an object
isinstance(object, type) is True # but also a type
isinstance(type, object) is True # type is an object
isinstance(type, type) is True # but also a type

(*) Like all black magick it is useful for understanding the world but never for actual use. Well, except perhaps in very rare circumstances if you know what you’re doing.

November 08, 2024 12:00 AM UTC

Some Personal History with Python

StaticTyping

📘 Written in 2021.

IronPython in Action was published on the 7th April 2009 and we sold a little over 7000 copies.

Royalties for last quarter amounted to $25.

It took me two years to write thirteen chapters and a couple of appendices, and took Christian Muirhead about the same to write two chapters and an appendix. Jonathan Hartley did the diagrams and illustrations and the worst part was compiling the index.

It took so long because IronPython was still in alpha (!) when we started and it changed several times (including a Silverlight version being released) whilst writing!

After leaving Resolver Systems in 2010 I spent a year contracting on Line of Business apps that ran in Silverlight (Django on the server): Python code running in the browser on the client side. It was glorious.

We even had functional tests on unittest built in to the app.

Work on mock accelerated massively once IronPython in Action was complete. MagickMock was born not long afterwards.

I was also helping maintain the python.org website and adding test discovery to unittest at the time, and speaking at every conference I could find.

It felt like the glory days of the Python community. It’s almost time for PyCon (online) and I’m nostalgic once again.

My first PyCon, the second Dallas PyCon and my first time in the US, there were about 600 attendees. You could almost know everyone.

I shaved my beard to enter Dallas and wore my hair in a pony tail. All I knew was they didn’t like hippies there. It was the nicest greeting at a US airport I’ve ever had.

I went on a road trip with Andrzej Krzywda afterwards trying to find mountains. We found the Ouchita mountains in Oaklahoma and drove back through Arkansas to visit friends of mine in Houston. Along the peaks of the mountains, which are hills really, we found a view called Dead Man’s Vista and we I laughed together at Microsoft.

Not long after this the web explosion happened and Django happened, google adopted Python as an official language and the community started to explode and grow.

That was even before Python became huge as a teaching language and before Python exploded in data science too.

I once paired with Jacob Kaplan Moss at a PyCon sprint and fixed some issue by adding a metaclass to the Django codebase. Which he never committed and found a better way.

That’s the closest I’ve come to deploying a metaclass I think, although I’ve removed a few in my time.

I knew Python had “made it” as a language when one bag stuffing pre-PyCon I met someone who didn’t want to be there. He’d been sent by work. Before that Python was obscure, and only people who really loved it went to PyCon. Which I’m convinced is the secret of Python’s success.

It was built by passion not by money. For the sheer love and the joy of building something beautiful with other people.

I was a Mac user then and had a running joke with Jonathan Hartley about Linux and projectors.

One time he plugged his laptop into the projector prior to his PyCon talk (Testing is a Silver Bullet), tried to fix the x-config from the terminal and rendered his laptop unusable. He did the presentation on mine. The next year Mark Shuttleworth did a keynote talk at PyCon and running some bleeding edge version of Ubuntu also couldn’t plug it into the projector system. Hilarity on my part.

The biggest conference I ever spoke at was a Microsoft one in Brighton where they demoed Silverlight and I demoed IronPython on Silverlight. They didn’t tell me I would be on main stage in front of a few thousand Microsoft devs. I was used to talking to a few hundred at a time!

I had a slide deck built from S5 with reStructured Text markup and a Far Side slide mocking static typing. Which went down a bomb to an audience of C# devs. I still managed, by coincidence, to demo almost the same features of Silverlight as Microsoft bigwig Scott Hanselman who did the keynote.

It was an “interesting experience”, evangelising Python and dynamic languages in “the heart of the beast” as it were. Microsoft went on to step up their involvement with Python and sincere Open Source commitments which they’ve maintained since.

Since I first wrote this Python has finally made it, ranked as the most widely used programming language in the world by TIOBE and PyPL. World number one.

I joined Twitter fourteen years ago and have tweeted over fifty-two thousand times. I follow 1,636 accounts, which is too many, and have 8,670 followers. I use Tweetdeck which is run by Twitter and doesn’t show ads or promoted tweets or mess with tweet order and it lets me use two different accounts.

I use twitter a lot less than I did during my social media and community frenzy whilst I delighted to learn Python, but I still enjoy it.

During that time (2006-2011) I “drank from the firehose”. I read all of slashdot (scanned every headline and read relevant articles), read all of comp.lang.python (every message title - read and replied to many), read all of python-dev (similarly) and all of testing-in-python, blogged almost daily and worked full time as a software engineer commuting to London four times a week and developed mock in my spare time and worked on unittest in the Python standard library. And wrote a book and worked part time doing community liaison and service development for a local charity working with the homeless and disadvantaged. I was Microsoft MVP for three years for my work with IronPython, I spoke at countless conferences and received the Python Software Foundation Community Award for my work running Planet Python and helping out with the Python.org website and mailing infrastructure.

Then in 2011 my first child was born and I started working for Canonical. Three years of large Django web applications then three years of Go and MongoDB and then a year with Red Hat testing Ansible Tower and now four years self employed.

During that time I remembered that the primary drive in my life was spiritual and I started meditating again. One hour a day of mindfulness of breathing. That transformed my life all over again.

I once rode in the back of a beaten up station wagon owned and operated by the creator of the Python programming language whilst sat alongside the creator of Bitorrent, which was written in Python.

I also once had a pub lunch in Oxford with the creator of the Erlang programming language and the creator of the Haskell programming language. We were all three speaking at the ACCU conference. I was speaking on IronPython.

It’s been a fun journey.

November 08, 2024 12:00 AM UTC

November 07, 2024

Python Software Foundation

PSF Grants Program Updates: Workgroup Charter, Future, & Refresh (Part 2)

Building on Part 1 of this PSF Grants Program Update, we are pleased to share updates to the Grants Workgroup (workgroup) Charter. We have outlined all the changes below in a chart, but there are a couple of changes that we’d like to highlight to grant applicants. These updates in particular will change how and when you apply, and hopefully reduce blockers to getting those applications in and ready for review. Because we are just sharing these updates, we are happy to be flexible on these changes but hope to see all applicants adhere to the changes starting around January 2025.

Increase overall process time frame to 8 weeks (formerly 6 weeks). We want to be realistic about how long the process takes and we know that going over our projection can cause pain for applicants. We hope to turn around applications in 6 weeks in most cases, but planning for the extra two weeks can make a big difference for everyone involved!

Our application form requires that you set the event date out to 6 weeks in advance. We will wait to update that to 8 weeks in advance until January 2025.
It’s important to note that this time frame begins only once all required information has been received, not exactly from the day the application is submitted. Make sure to check the email you provided on the application to see if the workgroup Chair has any questions regarding your request!

Add a statement of support for accessibility services. In line with the PSF’s mission to support and facilitate the growth of a diverse community, we are explicitly stating in the charter that we will consider funding accessibility services. For established events (have 2 or more events in the past with more than 200 participants at the last event), we are open to considering accessibility-related requests such as live captioning, sign language interpretation, or certified child care.

To review these types of requests, we will need sufficient documentation such as quotes, certifications, or any other relevant information.

Add guidelines around program/schedule review. Previously undocumented, we were checking event programs/schedules to ensure a Python focus as well as a diversity of speakers. Because of event organizing time frames, we often received grant requests before the schedule was available. Moving forward we are accepting 1 of 3 options:

The program/schedule for the event
A tentative schedule or list of accepted speakers/sessions for the event
Programs from previous editions of the event if available, a link to the event’s call for proposals, which should state a required Python focus for the event as well as a statement in support of a diverse speaker group, and a description of the efforts that are being made to ensure a diversity of speakers.

Grants Workgroup Charter Updates

Update Summary	Former Charter	Projected Benefit	Observations
Establish fast-track grants: Grants that meet pre-approved criteria skip the review period with the workgroup and go straight to a vote	Did not exist previously	Resolutions reach applicants sooner, reduce load on workgroup	Not many events meet the initial criteria we set to qualify for fast-track review, so this is mostly untested
Establish workgroup participation criteria: workgroup members must participate in 60% of the votes to remain active	Did not exist previously	Resolutions reach applicants sooner, set out clear guidelines on the meaning of active participation, reduce load on Chair	Reduction of workgroup membership to only active members has resulted in shorter voting periods by removing blockers to meeting quorum
Increase $ amount for PSF Board review: Grant requests over 15K are reviewed by PSF Board	Grant requests over 10K were reviewed by PSF Board	Resolutions reach applicants sooner, reduces load on PSF Board to ensure they are focused on high level efforts	Resolutions have reached applicants sooner, some reduction in load for PSF Board as we are still receiving applications over 15K
Increase process timeframe: 8 week processing time from when all information has been received	6 week processing time from when all information has been received	Improves community satisfaction and sets realistic expectations, reduces stress on workgroup & Chair	We are just sharing this update so it has yet to be tested- come to our Grants Office Hour session to discuss it with us!
Establish schedule for Grant review process: 10 day review period and 10 day voting period	Did not exist previously	Improve community satisfaction by ensuring requests are moving through the process promptly	This has worked great to keep things moving as the workgroup has a set expectation of how long they have to comment
Establish guideline for workgroup process: no discussion after the vote has begun	Did not exist previously	Improve community satisfaction by ensuring requests are moving hrough the process promptly	While untested, this has set an expectation for the workgroup to comment during the review period
Update voting mechanics: Votes will last for 10 days, a majority is reached, or when all voting members have voted, whichever comes first. For a proposal to be successful, it must have ayes in the majority totalling 30% of the WG	Decisions were made by a majority rule (50%+1), with no time limit	Improve community satisfaction by ensuring votes take place promptly, reduce stress on the workgroup and Chair if members are absent or unable to participate	This has worked wonderfully! The Chair no longer has to track down votes. Paired with the participation guideline, voting periods no longer present a bottleneck
Removed stated set budget: The annual budget is set by the PSF Board and is subject to change	The previously documented budget was $120,000 (regularly exceeded)	Removes an inaccurate description of the Grants Program budget and the need to update this line yearly	A practical update, no observations to note
Update workgroup officer roles: one Chair, one Vice Chair, one Appointed Board Director	One Chair and two Vice Chairs	Correct an unusual and discouraged practice of having two vice chairs and ensures PSF Board participation	A practical update, no observations to note
Add a statement of support for accessibility services: for mature events, consideration of granting funds for accessibility services	Did not exist previously	Establishes criteria for the workgroup and Board to consider accessibility-related requests	We are just sharing this update so it has yet to be tested- come to our Grants Office Hour session to discuss it with us!
Additional guidelines around grant reviews: tentative schedules OR previous schedules, CfP that shows a Python focus, as well as a description of the efforts being made to ensure a diversity of speakers	Did not exist previously in documented form, though we checked for a program	Improve community satisfaction with the process, remove delays in the grant review process	This has been a great addition, and blockers for many applications have been removed!

.table { display: block; overflow-y: hidden; overflow-x: auto; scroll-behavior: smooth; } .table thead { display: table-header-group; vertical-align: middle; border-color: inherit; color: inherit; background: darkcyan; } table, th, td { border: 1px solid black; border-collapse: collapse; } tr { display: table-row; vertical-align: inherit; border-color: inherit; } table th { padding: 16px; text-align: inherit; border-bottom: 1px solid black; color:white!important; } tbody { display: table-row-group; vertical-align: middle; border-color: inherit; } table:not(.tr-caption-container) { min-width: 100%; border-radius: 3px; }

What’s next?

Still on our Grants Program refresh to-do list is:

Mapping Board-mandated priorities for the Grants Program to policy
Charter adjustments as needed, based on the priority mapping
Main documentation page re-write
Budget template update
Application form overhaul
Transparency report for 2024
Exploration and development of other resources that our grant applicants would find useful

Our community is ever-changing and growing, and we plan to be there every step of the way and continue crafting the Grants Program to serve Pythonistas worldwide. If you have questions or comments, we welcome and encourage you to join us at our monthly Grants Program Office Hour sessions on the PSF Discord.

November 07, 2024 10:58 AM UTC

PSF Grants Program Updates: Workgroup Charter, Future, & Refresh (Part 1)

Time has flown by since we received the community call last December for greater transparency and better processes around our Grants Program. PSF staff have produced a Grants Program Transparency Report and begun holding monthly Grants Program Office Hours. The PSF Board also invested in a third-party retrospective and launched a major refresh of all areas of our Grants program.

To provide the Grants Program more support, we assigned Marie Nordin, PSF Community Communications Manager, to support the Grants Program alongside Laura Graves, Senior Accountant. Marie has stepped into the Grants Workgroup Chair role to relieve Laura after 3+ years– thank you, Laura! Marie has been leading the initiatives and work related to the Grants Program in collaboration with Laura.

Behind the scenes, PSF staff has been working with the PSF Board and the Grants Workgroup (workgroup) to translate the feedback we’ve received and the analysis we’ve performed into action, starting with the Grants Workgroup Charter. A full breakdown of updates to the charter can be found in Part 2 of this update.

The PSF Board spent time on their recent retreat to explore priorities for the program going forward. We also ran a more thorough workgroup membership renewal process based on the updated charter to support quicker grant reviews and votes through active workgroup engagement. We’re excited to share refresh progress, updates, and plans for the future of the program later on in this post!

Something wonderful, bringing more changes

Meanwhile, the attention our Grants Program has received in the past year has resulted in something wonderful: we’re getting more requests than ever. Our call to historically underrepresented regions to request funds has been answered in some areas- and we are thrilled! For example, in the African region, we granted around 65K in 2023 and over 140K already this year! And, year to date in 2024 we have awarded more grant funding than we did in all of 2023. The other side of this coin presents us with a new issue– the budget for the program.

Up until this year, we’ve been able to grant at least partial funding to the majority of requests we’ve received while staying within our guidelines and maintaining a feasible annual budget. With more eligible requests incoming, every “yes” brings us closer to the ceiling of our grant budget. In addition to the increased quantity of requests, we are receiving requests for higher amounts. Inflation and the tech crunch have been hitting event organizers everywhere (this includes the PSF-produced PyCon US), and we are seeing that reflected in the number and size of the grant requests we are receiving.

Moving forward, with the increased quantity and amount of eligible grant requests, we will need to take steps to ensure we are balancing grant awards with sustainability for our Grants Program, and the Foundation overall. We know that the most important part of any changes to the Grants Program is awareness and two-way communications with the community. We aim to do that as early and transparently as we possibly can. That means we aren’t changing anything about how we award grants today or even next week– but within the next couple of months. Please keep an eye on our blog and social accounts (Mastodon, X, LinkedIn) for news about upcoming changes, and make sure to share this post with your fellow Python event and initiative organizers.

Grants Workgroup Charter update process

The purpose of the PSF Grants Workgroup (workgroup) is to review, approve, and deny grant funding proposals for Python conferences, training workshops, Meetups, development projects, and other related Python initiatives. The workgroup charter outlines processes, guidelines, and membership requirements for the workgroup. Small changes have been made to the charter over the years, but it’s been some time since any significant changes were implemented.

During the summer of 2024, Marie, workgroup chair (hi 👋 it’s me writing this!), and Laura worked on updates for the charter. The updates focused on how to make the Grants Program processes and guidelines work better for the workgroup, the PSF Board, and most especially, the community we serve.

After many hours of discussing pain points, running scenarios, exploring possible guidelines, and drafting the actual wording, Marie and Laura introduced proposed updates for the charter to the Board in July. After a month of review and 1:1 meetings with the PSF Board and workgroup members, the updated charter went to a vote with the PSF Board on August 14th and was approved unanimously.

The workgroup has been operating under its new charter for a couple of months. Before we shared broadly with the community, we wanted to make sure the updates didn’t cause unintended consequences, and we were ready to walk back anything that didn’t make sense. Turns out, our hard work paid off, and the updates have been mostly working as we hoped. We will continue to monitor the impact of the changes and make any adjustments in the next Charter update. Read up on the Grants Workgroup Charter updates in Part 2 of this blog post!

November 07, 2024 10:56 AM UTC

November 06, 2024

Real Python

How to Reset a pandas DataFrame Index

In this tutorial, you’ll learn how to reset a pandas DataFrame index, the reasons why you might want to do this, and the problems that could occur if you don’t.

Before you start your learning journey, you should familiarize yourself with how to create a pandas DataFrame. Knowing the difference between a DataFrame and a pandas Series will also prove useful to you.

In addition, you may want to use the data analysis tool Jupyter Notebook as you work through the examples in this tutorial. Alternatively, JupyterLab will give you an enhanced notebook experience, but feel free to use any Python environment you wish.

As a starting point, you’ll need some data. To begin with, you’ll use the band_members.csv file included in the downloadable materials that you can access by clicking the link below:

Get Your Code: Click here to download the free sample code you’ll use to learn how to reset a pandas DataFrame index.

The table below describes the data from band_members.csv that you’ll begin with:

Column Name	PyArrow Data Type	Description
`first_name`	`string`	First name of member
`last_name`	`string`	Last name of member
`instrument`	`string`	Main instrument played
`date_of_birth`	`string`	Member’s date of birth

As you’ll see, the data has details of the members of the rock band The Beach Boys. Each row contains information about its various members both past and present.

Note: In case you’ve never heard of The Beach Boys, they’re an American rock band formed in the early 1960s.

Throughout this tutorial, you’ll be using the pandas library to allow you to work with DataFrames, as well as the newer PyArrow library. The PyArrow library provides pandas with its own optimized data types, which are faster and less memory-intensive than the traditional NumPy types that pandas uses by default.

If you’re working at the command line, you can install both pandas and pyarrow using the single command python -m pip install pandas pyarrow. If you’re working in a Jupyter Notebook, you should use !python -m pip install pandas pyarrow. Regardless, you should do this within a virtual environment to avoid clashes with the libraries you use in your global environment.

Once you have the libraries in place, it’s time to read your data into a DataFrame:

Python
      
>>> import pandas as pd

>>> beach_boys = pd.read_csv(
...     "band_members.csv"
... ).convert_dtypes(dtype_backend="pyarrow")
Copied!

First, you used import pandas to make the library available within your code. To construct the DataFrame and read it into the beach_boys variable, you used pandas’ read_csv() function, passing band_members.csv as the file to read. Finally, by passing dtype_backend="pyarrow" to .convert_dtypes() you convert all columns to pyarrow types.

If you want to verify that pyarrow data types are indeed being used, then beach_boys.dtypes will satisfy your curiosity:

Python
      
        
      
    
>>> beach_boys.dtypes
first_name            string[pyarrow]
last_name             string[pyarrow]
instrument            string[pyarrow]
date_of_birth         string[pyarrow]
dtype: object
Copied!

As you can see, each data type contains [pyarrow] in its name.

If you wanted to analyze the date information thoroughly, then you would parse the date_of_birth column to make sure dates are read as a suitable pyarrow date type. This would allow you to analyze by specific days, months or years, and so on, as commonly found in pivot tables.

The date_of_birth column is not analyzed in this tutorial, so the string data type it’s being read as will do. Later on, you’ll get the chance to hone your skills with some exercises. The solutions include the date parsing code if you want to see how it’s done.

Now that the file has been loaded into a DataFrame, you’ll probably want to take a look at it:

Python
      
        
      
    
>>> beach_boys
  first_name last_name instrument date_of_birth
0      Brian    Wilson       Bass   20-Jun-1942
1       Mike      Love  Saxophone   15-Mar-1941
2         Al   Jardine     Guitar   03-Sep-1942
3      Bruce  Johnston       Bass   27-Jun-1942
4       Carl    Wilson     Guitar   21-Dec-1946
5     Dennis    Wilson      Drums   04-Dec-1944
6      David     Marks     Guitar   22-Aug-1948
7      Ricky    Fataar      Drums   05-Sep-1952
8    Blondie   Chaplin     Guitar   07-Jul-1951
Copied!

DataFrames are two-dimensional data structures similar to spreadsheets or database tables. A pandas DataFrame can be considered a set of columns, with each column being a pandas Series. Each column also has a heading, which is the name property of the Series, and each row has a label, which is referred to as an element of its associated index object.

The DataFrame’s index is shown to the left of the DataFrame. It’s not part of the original band_members.csv source file, but is added as part of the DataFrame creation process. It’s this index object you’re learning to reset.

The index of a DataFrame is an additional column of labels that helps you identify rows. When used in combination with column headings, it allows you to access specific data within your DataFrame. The default index labels are a sequence of integers, but you can use strings to make them more meaningful. You can actually use any hashable type for your index, but integers, strings, and timestamps are the most common.

Note: Although indexes are certainly useful in pandas, an alternative to pandas is the new high-performance Polars library, which eliminates them in favor of row numbers. This may come as a surprise, but aside from being used for selecting rows or columns, indexes aren’t often used when analyzing DataFrames. Also, row numbers always remain sequential when rows are added or removed in a Polars DataFrame. This isn’t the case with indexes in pandas.

Read the full article at https://realpython.com/pandas-reset-index/ »

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

November 06, 2024 02:00 PM UTC

Programiz

Python match…case Statement

The match…case statement allows us to execute different actions based on the value of an expression. In this tutorial, you will learn how to use the Python match…case with the help of examples.

November 06, 2024 10:10 AM UTC

Matt Layman

Deploy Your Own Web App With Kamal 2

Kamal offers zero-downtime deploys, rolling restarts, asset bridging, remote builds, accessory service management, and everything else you need to deploy and manage your web app in production with Docker. Originally built for Rails apps, Kamal will work with any type of web app that can be containerized. We dig into Kamal, how it works, and how you could use it on your next project.

November 06, 2024 12:00 AM UTC

November 05, 2024

TestDriven.io

Avoid Counting in Django Pagination

This article looks at how to avoid the count query in Django's paginator.

November 05, 2024 10:28 PM UTC

PyCoder’s Weekly

Issue #654 (Nov. 5, 2024)

#654 – NOVEMBER 5, 2024
View in Browser »

PySheets: Spreadsheets in the Browser Using PyScript

What goes into building a spreadsheet application in Python that runs in the browser? How do you make it launch quickly, and where do you store the cells of data? This week on the show, we speak with Chris Laffra about his project, PySheets, and his book “Communication for Engineers.”
REAL PYTHON podcast

Adding Keyboard Shortcuts to the Python REPL

Python 3.13 included a new version of the REPL which has the ability to define keyboard shortcuts. This article shows you how to create one and warns you about potential hangups.
TREY HUNNER

Tired of Being Paged? Worry Less With Temporal

Say goodbye to managing failures, network outages, flaky endpoints, and long-running processes. Temporal ensures your code never fails. Period. PLUS, you can get started today on Temporal Cloud with $1,000 free credits on us →
TEMPORAL TECHNOLOGIES sponsor

Running a Million Empty Tests

To better understand just where the performance cost of running tests comes from, Anders ran a million empty tests. This post talks about what he did and the final results.
ANDERS HOVMOLLER

Pillow Release 11.0.0

GITHUB.COM/PYTHON-PILLOW

PEP 750: Template Strings (Major Updates)

PSF

PEP 756: Add `PyUnicode_Export()` and `PyUnicode_Import()` C Functions (Withdrawn)

PSF

Discussions

Thinking of Rewriting Our Go / Java API in Python

Best GUI for Local Client App?

Articles & Tutorials

Move to Sigstore Complicates Linux Distros

Currently, CPython signs its artifacts with both PGP and Sigstore. Removing the PGP signature has been proposed, but that has implications: Sigstore is still new enough that many Linux distributions don’t support it yet.
JOE BROCKMEIER

Python’s Magic Methods in Classes

In this video course, you’ll learn what magic methods are in Python, how they work, and how to use them in your custom classes to support powerful features in your object-oriented code.
REAL PYTHON course

[Webinar] How to Build Secure, Ethical, and Scalable AI Operations

As GenAI and LLMs rapidly evolve, the impact of data leaks and unsafe AI outputs makes it critical to secure your AI infrastructure. Learn how MLOps and ML Platform teams can use the newly launched Guardrails Pro to secure AI operations — enabling faster, safer adoption of LLMs at scale →
GUARDRAILS sponsor

Make It Ephemeral: Software Should Decay and Lose Data

In the real world, things decay over time. In the digital world things get kept forever, and sometimes that shouldn’t be so. Designing for deletion is hard.
ARMIN RONACHER

Python 3.13, t-Strings, Dep Groups…

Bite code! does their monthly Python news wrap-up. Check out stories on 3.13, proposed template strings, dependency groups in pyproject.toml, and more.
BITE CODE!

Identifying Products From Images

This project uses computer vision solution to automate doing inventory of products in retail, using YOLOv8 and image embeddings for precise detection.
ALBERT FERRÉ • Shared by Albert Ferré

Write More Pythonic Code With Context Managers

Context managers enable you to create “template” code with initialization and clean up to make the code that uses them easier to read and understand.
JUHA-MATTI SANTALA

Django Girls 10th Birthday!

This post celebrating ten years of Django Girls talks about how it got started, what they’re hoping to do, and how you can get involved.
DJANGO GIRLS

`pytest` Selection Arguments for Failing Tests

This quick TIL post talks about five useful pytest options that let you control what tests to run with respect to failing tests.
RODRIGO GIRÃO SERRÃO

Asyncio `gather()` Return Values

This post shows you how to return values from coroutines that have been concurrently executed using asyncio.gather().
JASON BROWNLEE

PyBay 2024

This list contains the recorded talks from the PyBay 2024 conference.
YOUTUBE video

Projects & Code

wimsey: Data Contract Library

GITHUB.COM/BENRUTTER

libcom: Image Composition Toolbox

GITHUB.COM/BCMI

simplemind: Experimental Client for AI Providers

GITHUB.COM/KENNETHREITZ

PyChrono: Multi-Physics Simulation in Python

CRISTIANOPIZZAMIGLIO.COM • Shared by Cristiano Pizzamiglio

jamesql: In-Memory NoSQL Database in Python

GITHUB.COM/CAPJAMESG

Events

Weekly Real Python Office Hours Q&A (Virtual)

November 6, 2024
REALPYTHON.COM

Canberra Python Meetup

November 7, 2024
MEETUP.COM

Sydney Python User Group (SyPy)

November 7, 2024
SYPY.ORG

DFW Pythoneers 2nd Saturday Teaching Meeting

November 9, 2024
MEETUP.COM

PiterPy Meetup

November 12, 2024
PITERPY.COM

PyCon Sweden 2024

November 14 to November 16, 2024
PYCON.SE

PyCon Hong Kong 2024

November 16 to November 17, 2024
PYCON.HK

PyCon Mini Tokai 2024

November 16 to November 17, 2024
PYCON.JP

PyCon Ireland 2024

November 16 to November 18, 2024
PYTHON.IE

Happy Pythoning!
This was PyCoder’s Weekly Issue #654.
View in Browser »

[ Subscribe to 🐍 PyCoder’s Weekly 💌 – Get the best Python news, articles, and tutorials delivered to your inbox once a week >> Click here to learn more ]

November 05, 2024 07:30 PM UTC

Real Python

Introduction to Web Scraping With Python

Web scraping is the process of collecting and parsing raw data from the Web, and the Python community has come up with some pretty powerful web scraping tools.

The Internet hosts perhaps the greatest source of information on the planet. Many disciplines, such as data science, business intelligence, and investigative reporting, can benefit enormously from collecting and analyzing data from websites.

In this video course, you’ll learn how to:

Parse website data using string methods and regular expressions
Parse website data using an HTML parser
Interact with forms and other website components

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

November 05, 2024 02:00 PM UTC

Quiz: Variables in Python: Usage and Best Practices

In this quiz, you’ll test your understanding of Variables in Python: Usage and Best Practices.

By working through this quiz, you’ll revisit how to create and assign values to variables, change a variable’s data type dynamically, use variables to create expressions, counters, accumulators, and Boolean flags, follow best practices for naming variables, and create, access, and use variables in their scopes.

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

November 05, 2024 12:00 PM UTC

Talk Python to Me

#484: From React to a Django+HTMX based stack

Have you heard about HTMX? We've discussed it a time or two on this show. We're back with another episode on HTMX, this time with a real-world success story and lessons learned. We have Sheena O'Connell on to tell us how she moved from a React-Django app to pure Django with HTMX. Episode sponsors <a href='https://talkpython.fm/posit'>Posit</a> <a href='https://talkpython.fm/bluehost'>Bluehost</a> <a href='https://talkpython.fm/training'>Talk Python Courses</a> Links from the show <div>Sheena O'Connell: <a href="https://www.sheenaoc.com?featured_on=talkpython" target="_blank" >sheenaoc.com</a> An HTMX success story essay: <a href="https://www.sheenaoc.com/articles/2024-06-30-htmx?featured_on=talkpython" target="_blank" >sheenaoc.com</a> Sheena's HTMX Workshop: <a href="https://prelude.tech/upcoming_workshops?featured_on=talkpython" target="_blank" >prelude.tech - discount code: talk_python</a> Talk Python's HTMX Courses HTMX + Flask course: <a href="https://training.talkpython.fm/courses/htmx-flask-modern-python-web-apps-hold-the-javascript" target="_blank" >training.talkpython.fm</a> HTMX + Django course: <a href="https://training.talkpython.fm/courses/htmx-django-modern-python-web-apps-hold-the-javascript" target="_blank" >training.talkpython.fm</a> Build An Audio AI App course: <a href="https://training.talkpython.fm/courses/build-an-audio-ai-app-with-python-and-assemblyai" target="_blank" >training.talkpython.fm</a> HTMX: <a href="https://htmx.org?featured_on=talkpython" target="_blank" >htmx.org</a> Playwright: <a href="https://playwright.dev?featured_on=talkpython" target="_blank" >playwright.dev</a> django-template-partials: <a href="https://github.com/carltongibson/django-template-partials?featured_on=talkpython" target="_blank" >github.com</a> Michael's jinja_partials: <a href="https://github.com/mikeckennedy/jinja_partials?featured_on=talkpython" target="_blank" >github.com</a> django-guardian: <a href="https://github.com/django-guardian/django-guardian?featured_on=talkpython" target="_blank" >github.com</a> Talk Python Courses HTMX Example: <a href="https://training.talkpython.fm/courses/all" target="_blank" >training.talkpython.fm/courses/all</a> Alpine.js: <a href="https://alpinejs.dev?featured_on=talkpython" target="_blank" >alpinejs.dev</a> David Guillot SaaS video: <a href="https://www.youtube.com/watch?v=3GObi93tjZI" target="_blank" >youtube.com</a> awesome-htmx: <a href="https://github.com/rajasegar/awesome-htmx?featured_on=talkpython" target="_blank" >github.com</a> Guild of Educators: <a href="https://guildofeducators.org?featured_on=talkpython" target="_blank" >guildofeducators.org</a> The big rewrite song: <a href="https://www.youtube.com/watch?v=xCGu5Z_vaps" target="_blank" >youtube.com</a> Watch this episode on YouTube: <a href="https://www.youtube.com/watch?v=ZTAEkPRNbi4" target="_blank" >youtube.com</a> Episode transcripts: <a href="https://talkpython.fm/episodes/transcript/484/from-react-to-a-django-htmx-based-stack" target="_blank" >talkpython.fm</a> --- Stay in touch with us --- Subscribe to us on YouTube: <a href="https://talkpython.fm/youtube" target="_blank" >youtube.com</a> Follow Talk Python on Mastodon: <a href="https://fosstodon.org/web/@talkpython" target="_blank" >talkpython</a> Follow Michael on Mastodon: <a href="https://fosstodon.org/web/@mkennedy" target="_blank" >mkennedy</a> </div>

November 05, 2024 08:00 AM UTC

Tryton News

Tryton Release 7.4

We are proud to announce the 7.4 release of Tryton .
This release provides many bug fixes, performance improvements and some fine tuning.
You can give it a try on the demo server, use the docker image or download it here.
As usual upgrading from previous series is fully supported.

Here is a list of the most noticeable changes:

Changes for the User

Clients

The Many2Many widget now has a restore button to revert the removal of records before saving.

The CSV export window stays open after the export is done so you can refine your export without having the redo all of the configuration.
It also supports exporting and importing translatable fields with a language per column.
The error messages displayed when there is a problem with the CSV import have been improved to include the row and column number of the value that caused the error.

The management window for the favourites has been removed and replaced by a simple “last favorite first” order.

The focus goes back to the search entry after performing a search/refresh.

You can now close a tab by middle clicking on it (as is common in other software).

Web Client

The left menu and the attachment preview can now be resized so the user can make them the optimal size for their screen.

Accounting

The minimal chart of accounts has been replaced by the a universal chart of accounts which is a good base for IFRS and US GAAP.

It is now possible to copy an accounting move from a closed period. The closed period will be replaced by the current period after accepting the warning.

The payments are now numbered to make it easier to identify them inside the application.
An option has been added to the parties to allow direct debits to be created based on the balance instead of the accounting lines.
We’ve added a button on the Stripe payments and Stripe and Braintree customers to allow an updated to be forced. This helps when fixing missed webhooks.

When a stock move is cancelled, the corresponding stock account move is now cancelled automatically.
But it now no longer possible to cancel a done stock move which has been included in a calculation used for anglo-saxon accounting.

Commission

It is now possible to deactivate an agent so that they are no longer used for future orders.

Company

It is now possible to add a company logo. This is then displayed in the header of generated documents.

Incoterm

A warning is now raised when the incoterm of a shipment is different from the original document (such as the sale or purchase).

Party

We’ve added more identifiers for parties like the United Kingdom Unique Taxpayer Reference, Taiwanese Tax Number, Turkish tax identification number, El Salvador Tax Number, Singapore’s Unique Entity Number, Montenegro Tax Number and Kenya Tax Number.

Product

We’ve added a wizard to manage the replacement of products. Once there is no more stock of the replaced product in any of the warehouses, all the stock on all pending orders are replaced automatically.

A description can now be set for each product image.

There is now a button on the price list form to open the list of lines. This is helpful when the price list has a lot of lines.

Production

It is now possible to cancel a done production. All its stock moves are then cancelled.

The Bill of Materials now have an auto-generated internal code.

Purchase

The wizard to handle exceptions has been improved to clearly display the list of lines to recreate and the list of lines to ignore.

The menu entry Parties associated to Purchases has been removed in favour of the per party reporting.

The purchase amendment now supports amending the quantity of a purchase line using the secondary unit.

Quality

It is now no longer possible to delete non-pending inspections.

Sale

The wizards to handle exceptions have been improved to clearly display the list of lines to recreate and the list of lines to ignore.

The menu entry Parties associated to Sales has been removed in favor of the per party reporting.

A warning is now raised when the user tries to submit a complaint for the same origin as an existing complaint.

The reporting can be grouped per promotion.

From a promotion, it is now possible to list of the sales related to it.
The coupon number of promotion can now be reused once the previous promotion has expired.

The sale amendment now supports amending the quantity of a sale line using the secondary unit.

Stock

It is now possible to cancel a done shipment. When this happens the stock moves of the shipment are cancelled.

The task to reschedule late shipments now includes any shipment that is not yet done.

The supplier shipments no longer have a default planned date.

The customer shipments now have an extra state, Shipped, before the Done state.

The lot trace now shows the inventory as a document.

The package weight and the warehouse are now criteria that can be used when selecting a shipping method.

Changes for the System Administrator

The clients automatically retry 5 times on a 503 Service Unavailable response. They respect the Retry-After value if it is set in the response header. This is useful when performing short maintenance on the server without causing an interruption for the users.

The scheduled tasks now show when they are running and prevent the user from editing them (as they are locked anyway).
We also store their last duration for a month by default. So the administrator can analyze and find slow tasks.

It is now possible to configure a license key for the TinyMCE editor.
Also TinyMCE has been updated to version 7.

It is now possible to configure the command to use to convert a report to a different format. This allows the use of an external service like document-converter.

Accounting

The Accounting Party group has been merged into the *Accounting" group.

We now raise a warning when the user is changing one of the configured credentials used on external services. This is to prevent accidental modification.

Document Incoming

It is now possible to set a maximum size for the content of the document incoming requests.

Inbound Email

It is now possible to set a maximum size for the inbound email requests.

Web Shop

There is now a scheduled task that updates the cache that contains the product data feeds.

Changes for the Developer

Server

The ORM supports SQL Range functions and operators to build exclusion constraints. This allows, for example, the use of non-overlapping constraints using an index.
On PostgreSQL the btree_gist extension may be needed otherwise the ORM will fallback to locking querying the table.
The SQLite backend adds simple SQL constraints to the table schema.

The relational fields with a filter are no longer copied by default. This was a frequent source of bugs as the same relational field without the filter was already copied so it generated duplicates.

We’ve added a sparkline tool to generate textual sparklines. This allows the removal of the pygal dependency.

The activate_modules from testing now accepts a list of setup methods that are run before taking the backup. This speeds up any other tests which restore the backup as they then do not need to run those setup methods.

The backend now has a method to estimate the number of rows in a table. This is faster than counting when we only need an estimate, for example when choosing between a join and a sub-query.

We’ve added a ModelSQL.__setup_indexes__ method that prepares the indexes once the Pool has been loaded.

It is now possible to generate many sequential numbers in a single call. This allows, for example, to number a group of invoices with a single call.

The backend now uses JSONB by default for MultiSelection fields. It was already supported, but the database needed to be altered to activate the feature.

You can now define the cardinality (low, normal or high) for the index usage. This allows the backend to choose an optimal type of index to create.

We now have tools that apply the typing to columns of an SQLite query. This is needed because SQLite doesn’t do a good job of supporting CAST.

The RPC responses are now compressed if their size if large enough and the client accepts it.

The ModelView._changed_values and ModelStorage._save_values are now methods instead of properties. This makes it is easier to debug errors because AttributeError exceptions are no longer hidden.

The scheduled task runner now uses a pool of processes for better parallelism and management. Only the running task is now locked.

We’ve added an environment variable TEST_NETWORK so we can avoid running tests that require network access.

There is now a command line option for exporting translations and storing them as a po file in the corresponding module.
Tryton sets the python-format flag in the po file for the translations containing python formats. This allows Weblate (our translation service) to check if the translations keep the right placeholders.

Accounting

The payment amounts are now cached on the account move line to improve the performance when searching for lines to pay.
The payment amounts now have to be greater or equal to zero.

Purchase

Only purchase lines of type line can be used as an origin for a stock move.

Sale

Only sales lines of type line can be used as an origin for a stock move.

The fields from the Sale Shipment Cost Module are now all prefixed with sale_.

Stock

Cancelled moves are no longer included in the shipment and package measurements.

2 posts - 1 participant

Read full topic

November 05, 2024 07:00 AM UTC

Django Weblog

Django bugfix release issued: 5.1.3

Today we've issued the 5.1.3 bugfix release.

The release package and checksums are available from our downloads page, as well as from the Python Package Index. The PGP key ID used for this release is Mariusz Felisiak: 2EF56372BA48CD1B.

November 05, 2024 06:04 AM UTC

November 04, 2024

James Bennett

Three Django wishes

’Tis the season when people are posting their “Django wishlists”, for specific technical or organizational or community initiatives they’d like to see undertaken. Here are a few examples from around the Django community:

So, in the spirit of the season, here is my own list, which I’ve narrowed down to three wishes (in the tradition of many stories about wishes), consisting of one organizational item and two technical ones.

Pass the torch

This one requires a bit of background, so please bear with me.

The Django Software Foundation — usually just abbreviated “DSF” — is the nonprofit organization which officially “owns” Django. It’s the legal holder of all the intellectual property, including both the copyright to the original Django codebase (generously donated by the Lawrence Journal-World, where it was first developed) and the trademarks, such as the registered trademark on the name “Django” itself. The DSF does a lot (and could do more, with a bigger budget) to support the Django community, and offers financial support to the development of Django itself, but does not directly develop Django, or oversee Django’s development or technical direction.

Originally, that job went to Django co-creators Adrian Holovaty and Jacob Kaplan-Moss. They granted commit permissions to a growing team of collaborators, but remained the technical leaders of the Django project until 2014, when they stepped aside and a new body, called the Technical Board, was introduced to replace them. The Technical Board was elected by the Django committers — by this point usually referred to as “Django Core” — and although the committers continued to have broad authority to make additions or changes to Django’s codebase, the Technical Board became the ultimate decision-maker for things that needed a tie-breaking vote, or that were too large for a single committer to do solo (usually via Django Enhancement Proposals, or DEPs, modeled on the processes of many other open-source projects, including Python’s “PEPs”).

One thing the DSF has done is use some of its funds on the Django Fellowship program, which pays contractors (the “Django Fellows”) to carry out tasks like ticket triage, pull-request review, etc. which would otherwise rely on volunteer labor (with all the problems that involves).

But the system of “Django Core” committers and Technical Board did not work out especially well. Many of the committers were either intermittently active or just completely inactive, new committers were added rarely if ever, and it was unclear what sort of path there was (or even if there was a path) for a motivated contributor to work their way toward committer status. About the only thing that did work well was the Fellowship program, which largely was what kept Django running as a software project toward the end of that era.

This caused a lot of debates focused on the theme of what to do about “Django Core” and how to reform the project and get it back on a healthy footing. The end result of that was a Django Enhancement Proposal numbered as DEP 10, which I spent most of 2018 and 2019 working on. I wrote an explanation at the time, and I’ll just link it here and mention that DEP 10 (which passed in early 2020) kept the Technical Board as a tie-breaking and oversight body, and introduced two other main roles — “Mergers” and “Releasers” — which have mostly but not exclusively been filled by the Django Fellows. The first DEP 10 Technical Board drafted and passed another DEP, DEP 12, renaming themselves to “Steering Council” (similar to Python’s technical governing body, but a name I’ve never liked because the Django version doesn’t meaningfully “steer” Django) and making a few tweaks.

So, that brings us to the present day. Where, sadly, the DEP 10/12 era is looking like as much of a failure as the preceding “Django Core” + committer-elected Technical Board era. The DEP 10 Technical Boards/Steering Councils have been dysfunctional at best, and there’s been no influx of new people from outside the former “Django Core”. A stark example: I ran for the the Steering Council last year to try to work on fixing some of this, but the Steering Council election attracted only four total candidates for five seats, all of them former “Django Core” members.

Recently there was a lot of discussion on the DSF members’ forum about what to do with the Steering Council, and a few attempts to take action which failed in frustrating ways. The end result was the resignation of two Steering Council members, which brought the group below quorum and has automatically triggered an election (though one that will run under the existing DEP 10/12 rules, since triggering an election locks the eligibility and election rules against changes).

I believe the ongoing inability to develop stable technical governance and meaningful turnover of technical leadership is the single greatest threat to Django’s continued viability as a project. This is an unfortunate vindication of what I said six years ago in that blog post about developing DEP 10:

Django’s at risk of being badly off in the future; for some time now, the project has not managed to bring in new committers at a sufficient rate to replace those who’ve become less active or even entirely inactive, and that’s not sustainable for much longer.

The good news is there’s a new generation of contributors who I believe are more than ready to take up the technical leadership of Django, and even a structured program — not run by former “Django Core”! — for recruiting and mentoring new contributors on an ongoing basis and helping them build familiarity with working on and contributing to Django. The bad news is there’s a huge obstacle in their way: all of us old-time “Django Core” folks who keep occupying all the official leadership positions. Just recruiting people to run against such long-time well-known names in the project is difficult, and actually winning against us probably close to impossible.

So the biggest thing I’d like for Django, right now, is for the entire former “Django Core” group — myself included! — to simply get out of the way. I thought I could come back last year and help fix things after stepping down post-DEP-10, but doing so was a mistake and only prolonged the problem. I will not be running in the upcoming Steering Council election and I beg my “Django Core” colleagues to all do likewise. There are qualified, motivated folks out there who should be given their chance to step up and run things, and we should collectively place Django into their capable hands. Then they can sort out the rest of the technical governance however they see fit.

And honestly, I’ve been in and out of just about every formal role the Django project has for (checks calendar) seventeen years now. It’s time. It’s time for me, and the rest of the old guard, to give way to new folks before we do serious harm to Django by continuing to hold on to leadership roles.

Give Django a hint

Python 3.0 introduced the ability to add “annotations” to function and method declarations, and though it didn’t specify what they were to be used for, people almost immediately started developing ways to specify static type information via annotations, which came to be known as “type hints”. Python 3.5 formalized this and introduced the typing module in the standard library with tools to make the type-hint use case easier, and Python 3.6 introduced the ability to annotate other names, including standalone variables and class attributes.

Django has a complicated history with this feature of modern Python. There’ve been multiple efforts to add type annotations directly in Django’s own code, there’s a third-party package which provides annotations as an add-on, a proposed DEP never went anywhere because the Technical Board at the time was against it, and it’s just been stuck as a frequently-requested feature ever since.

Let me be absolutely clear: I don’t have any issue with statically-typed programming languages as a concept. I’ve used both statically- and dynamically-typed languages and liked and disliked examples of each. If I weren’t writing Python, personally I probably would be writing C# (statically-typed). But I also have absolutely no interest in static type checking for Python as a feature or a use case.

What I do have an interest in is all the other use cases type hints enable. There’s a whole booming ecosystem of modern Python tools out there now which use type hints to enable all sorts of interesting runtime behavior. Pydantic and msgspec do runtime derivation of validation and serialization/deserialization behavior from type hints. FastAPI and Litestar are web frameworks which use type hints to drive input/output schemas, dependency injection and more. SQLAlchemy as of version 2.0 can use type hints to drive ORM class definitions.

I am very interested in those sorts of things, and right now they’re not available from vanilla Django because Django doesn’t do type hints (you can use a third-party package to turn Django into something resembling one of the newer type-hint-driven frameworks, but it’s an extra package and a whole new way of doing things that doesn’t “feel like Django”).

Compare, for example, this Django ORM model:

from django.db import models

class Person(models.Model):
    name = models.CharField()
    date_of_birth = models.DateField()

With its modern SQLAlchemy equivalent:

from datetime import date   
from sqlalchemy.orm import DeclarativeBase, Mapped

Base = DeclarativeBase()

class Person(Base):
    __tablename__ = "person"
    name: Mapped[str]
    date_of_birth: Mapped[date]

You can use SQLAlchemy’s mapped_column() function to be more verbose and specify a bunch more information, but for a basic column you don’t have to. Just write a type hint and it does the right thing.

I think type hint support in Django has the potential to unlock a huge variety of useful new features and conveniences, and the lack of it is causing Django to fall well behind the current state of the art in Python web development. So if I could somehow wave a magic wand and get any single technical change instantly made to Django, type hints would be it.

More generic Django

Django includes a feature known as “generic views” (keep in mind that Django doesn’t strictly follow regular MVC terminology, and so a Django “view” is what most pure MVC implementations would call the “controller”), which are reusable implementations of common operations like “CRUD” (create, retrieve, update, delete operations — including both individual-result and list-of-results), date-based archives of data, etc.

And basically everybody agrees Django’s generic views are too complicated. There’s a giant complex inheritance tree of classes involved, with a huge mess of different attributes and methods you can set or override to affect behavior depending on exactly which set of classes you’re inheriting from, creating a steep learning curve and requiring even experienced developers to spend a lot of time with both official and unofficial documentation (ccbv.co.uk is the usual reference people are directed to).

There’s a reason for this complexity: originally, Django’s generic views were functions, not classes, and you customized their behavior by passing arguments to them. The class-based generic views were introduced in Django 1.3 (released in 2011), and for compatibility and ease of migration at the time, were implemented in a way which precisely mirrored the functionality of the function-based views. Which means that for every thing you could do via an argument to the function-based views, there is a mixin class, method, or attribute on the class-based ones corresponding to it.

This made some sense at the time, because it was a big migration to ask people to go through. It makes much less sense now, over 13 years later, when the complexity of Django’s hierarchy of class-based views mostly just scares people and makes them not want to use what is otherwise a pretty useful feature: class-based views are a huge reduction in repetitive/boilerplate code when you know how to use them (for example, see the views used by this site for date-based browsing of entries and detail/list views of entries by category — that really is all the code needed to provide all the backend logic).

At this point the overcomplexity of Django’s generic views is basically a meme in the community, and is one of the things I see most often cited by new Django users as making their experience difficult. So if I were going to be given the magic wand a second time and allowed to make another instant technical change, it’d be to finally deprecate the complicated generic-view class hierarchy and replace it with a ground-up rewrite aimed at providing a clear, powerful API rather than maintaining compatibility with a set of older functions that were deprecated nearly a decade and a half ago.

What do you wish for?

Of course, there’s a lot more that could be done to or for Django besides the three items I’ve outlined here. I’d encourage anyone who uses Django to think about what they’d like to see, to post about it, and, ideally, to get involved with Django’s development. That’s not just a way to get bits of your own wishlist implemented; it’s also the way to make sure Django continues to be around for people to have wishes about, and I hope that continues for many years to come.

November 04, 2024 11:21 PM UTC

Python Engineering at Microsoft

Announcing GitHub Copilot in Data Wrangler

AI did not write this blog post, but it will make your exploratory data analysis with Data Wrangler better!

Today, we’re excited to introduce our first step of integrating the power of Copilot into Data Wrangler.

With this first integration of Copilot with Data Wrangler, you’ll be able to:

Use natural language to clean and transform your data
Get help with fixing errors in your data transformation code

An example of using Copilot in Data Wrangler to filter for listings that allow dogs/cats

A common limitation of using AI tools for exploratory data analysis tasks today is the lack of data context provided to the AI. Responses are typically more generalized and not tailored to the specific task or data at hand. In addition, there’s always the manual and tedious task of verifying the correctness of the generated code.

What makes Copilot with Data Wrangler different is twofold. First, this integration allows you to choose to provide Copilot with your data context, enabling it to generate more relevant and specific code for the exact dataset you have open. Second, you get to preview the exact behavior of the code on your dataset with the Data Wrangler interface to visually validate Copilot’s response, along with all the benefits that the Data Wrangler tool provides.

Data transformations

With Copilot in Data Wrangler, you can ask it to perform ambiguous, open-ended transformations or a specific task you have in mind. Below we’ve included three examples of the many possibilities you can achieve with Copilot in Data Wrangler:

Formatting a datetime column

Removing any column(s) with over 40% missing values

Fixing an error in a data transformation

Getting started today

To use Copilot with Data Wrangler, you will need the following 3 prerequisites.

You must have the Data Wrangler extension for VS Code installed.
You must have the GitHub Copilot extension for VS Code installed.
You must have an active subscription for GitHub Copilot in your personal account, or you need to be assigned a seat by your organization. Sign up for a GitHub Copilot free trial in your personal account.

Follow these steps to Set up GitHub Copilot in VS Code.

Once the prerequisites are met, you will see the Copilot interface within Data Wrangler by default (customizable in the Data Wrangler settings) when you are in Editing Mode. You can then either select the input box or use the default Copilot keyboard shortcut of CMD/CTRL + I.

Responsible AI

AI is not perfect (neither are we!) and it will improve over time. Microsoft and GitHub Copilot follow Responsible AI principles and employ controls to ensure that your experience with the service is appropriate, pleasant, and useful. We understand there is hesitation and concern surrounding the rapid expansion of AI’s capabilities, and fully respect those who don’t want or can’t use Copilot.

If you have any feedback around the Copilot experience in Data Wrangler, please file an issue in our Data Wrangler public GitHub repository here.

Next Steps

We are just getting started. This is the first experience in Data Wrangler that we are enhancing with Copilot. Stay tuned for more AI-powered experiences in Data Wrangler to help with your data analysis needs soon!

The post Announcing GitHub Copilot in Data Wrangler appeared first on Python.

November 04, 2024 07:02 PM UTC

Real Python

Variables in Python: Usage and Best Practices

In Python, variables are symbolic names that refer to objects or values stored in your computer’s memory. They allow you to assign descriptive names to data, making it easier to manipulate and reuse values throughout your code.

Understanding variables is key for Python developers because variables are essential building blocks for any Python program. Proper use of variables allows you to write clear, readable, and maintainable code.

In this tutorial, you’ll learn how to:

Create and assign values to variables
Change a variable’s data type dynamically
Use variables to create expressions, counters, accumulators, and Boolean flags
Follow best practices for naming variables
Create, access, and use variables in their scopes

To get the most out of this tutorial, you should be familiar with Python’s basic data types and have a general understanding of programming concepts like loops and functions.

Don’t worry if you don’t have all this knowledge yet and you’re just getting started. You won’t need this knowledge to benefit from working through the early sections of this tutorial.

Get Your Code: Click here to download the free sample code that shows you how to use variables in Python.

Take the Quiz: Test your knowledge with our interactive “Variables in Python: Usage and Best Practices” quiz. You’ll receive a score upon completion to help you track your learning progress:

Interactive Quiz

Variables in Python: Usage and Best Practices

In this quiz, you'll test your understanding of variables in Python. Variables are symbolic names that refer to objects or values stored in your computer's memory, and they're essential building blocks for any Python program.

Getting to Know Variables in Python

In Python, variables are names associated with concrete objects or values stored in your computer’s memory. By associating a variable with a value, you can refer to the value using a descriptive name and reuse it as many times as needed in your code.

Variables behave as if they were the value they refer to. To use variables in your code, you first need to learn how to create them, which is pretty straightforward in Python.

Creating Variables With Assignments

The primary way to create a variable in Python is to assign it a value using the assignment operator and the following syntax:

Python Syntax
      
    
variable_name = value
Copied!

In this syntax, you have the variable’s name on the left, then the assignment (=) operator, followed by the value you want to assign to the variable at hand. The value in this construct can be any Python object, including strings, numbers, lists, dictionaries, or even custom objects.

Note: To learn more about assignments, check out Python’s Assignment Operator: Write Robust Assignments.

Here are a few examples of variables:

Python
      
>>> word = "Python"

>>> number = 42

>>> coefficient = 2.87

>>> fruits = ["apple", "mango", "grape"]

>>> ordinals = {1: "first", 2: "second", 3: "third"}

>>> class SomeCustomClass: pass
>>> instance = SomeCustomClass()
Copied!

In this code, you’ve defined several variables by assigning values to names. The first five examples include variables that refer to different built-in types. The last example shows that variables can also refer to custom objects like an instance of your SomeCustomClass class.

Setting and Changing a Variable’s Data Type

Apart from a variable’s value, it’s also important to consider the data type of the value. When you think about a variable’s type, you’re considering whether the variable refers to a string, integer, floating-point number, list, tuple, dictionary, custom object, or another data type.

Python is a dynamically typed language, which means that variable types are determined and checked at runtime rather than during compilation. Because of this, you don’t need to specify a variable’s type when you’re creating the variable. Python will infer a variable’s type from the assigned object.

Note: In Python, variables themselves don’t have data types. Instead, the objects that variables reference have types.

For example, consider the following variables:

Python
      
        
      
    
>>> name = "Jane Doe"
>>> age = 19
>>> subjects = ["Math", "English", "Physics", "Chemistry"]

>>> type(name)
<class 'str'>
>>> type(age)
<class 'int'>
>>> type(subjects)
<class 'list'>
Copied!

Read the full article at https://realpython.com/python-variables/ »

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

November 04, 2024 02:00 PM UTC

Robin Wilson

Join the GeoTAM hackathon to work out business turnovers!

Summary: I’m involved in organising a hackathon, and I’d love you to take part. The open-source GeoTAM hackathon focuses on estimating turnover for individual business locations in the UK, from a variety of open datasets. Please checkout the hackathon page and sign up. There are prizes of up to £2,000!

(Click image for a larger version)

I’m currently working with Rebalance Earth, a boutique asset manager who are focused on making nature an investable asset. Our aim is to mobilise investment in UK natural infrastructure – for example, by arranging investment to undertake river restoration and reduce the risk of flooding. We will do this by finding businesses at risk of flooding, designing restoration schemes that will reduce this risk, and setting up ‘Nature-as-a-Service’ contracts with businesses to pay for the restoration.

I’m the Lead Geospatial Developer at Rebalance Earth, and am leading the development of our Geospatial Predictive Analytics Platform (GPAP), which helps us assess businesses at risk of flooding and design schemes to reduce this flooding.

An important part of deciding which areas to focus on is estimating the total business value at risk from flooding. A good way of establishing this is to use an estimate of the business turnover. However, there are no openly-available datasets showing business turnover in the UK – which is where the hackathon comes in.

We’re looking for participants to bring their expertise in programming, data science, machine learning and more to take some datasets we provide, combine them with other open data and try and estimate turnover. Specifically, we’re interested in turnover of individual business locations – for example, the turnover of a specific supermarket, not the whole supermarket chain.

The hackathon runs from 20th – 26th November 2024. We’ll provide some datasets, some ideas, and a Discord server to communicate through. We’d like you to bring your expertise and see what you can produce. This is a tricky task, and we’re not expecting fully polished solutions; proof-of-concept solutions are absolutely fine. You can enter as a team or an individual.

Most importantly, there are prizes:

£2,000 for the First Prize
£1,000 for the Second Prize
£500 for the Third Prize

and there’s a possibility that we might even hire you to continue work on your idea!

So, please sign up and tell your friends!

November 04, 2024 11:04 AM UTC

ListenData

How to Automate WordPress using Python

This tutorial explains how to use Python to automate tasks in WordPress. It includes various functions to perform tasks such as creating, extracting, updating and deleting WordPress posts, pages, comments and media items (images) directly from Python.

Table of Contents

To read this article in full, please click here

November 04, 2024 08:29 AM UTC

Python Bytes

#408 python-preference only-managed 3.13t

Topics covered in this episode: <ul> <li><a href="https://nedbatchelder.com/blog/202410/github_action_security_zizmor.html?featured_on=pythonbytes">GitHub action security: zizmor</a></li> <li><a href="https://github.blog/news-insights/octoverse/octoverse-2024/?featured_on=pythonbytes">Python is now the top language on GitHub</a></li> <li><a href="https://www.bitecode.dev/p/python-313-what-didnt-make-the-headlines?featured_on=pythonbytes">Python 3.13, what didn't make the headlines</a></li> <li><a href="https://us.pycon.org/2025/?featured_on=pythonbytes">PyCon US 2025</a></li> <li>Extras</li> <li>Joke</li> </ul><a href='https://www.youtube.com/watch?v=9pyPp5lLSfI' style='font-weight: bold;'data-umami-event="Livestream-Past" data-umami-event-episode="408">Watch on YouTube</a> About the show Sponsored by: <ul> <li><a href="https://pythonbytes.fm/scout">ScoutAPM</a> - Django Application Performance Monitoring</li> <li><a href="https://pythonbytes.fm/codeium">Codeium</a> - Free AI Code Completion & Chat </li> </ul> Connect with the hosts <ul> <li>Michael: <a href="https://fosstodon.org/@mkennedy">@mkennedy@fosstodon.org</a></li> <li>Brian: <a href="https://fosstodon.org/@brianokken">@brianokken@fosstodon.org</a></li> <li>Show: <a href="https://fosstodon.org/@pythonbytes">@pythonbytes@fosstodon.org</a></li> </ul> Join us on YouTube at <a href="https://pythonbytes.fm/stream/live">pythonbytes.fm/live</a> to be part of the audience. Usually Monday at 10am PT. Older video versions available there too. Finally, if you want an artisanal, hand-crafted digest of every week of the show notes in email form? Add your name and email to <a href="https://pythonbytes.fm/friends-of-the-show">our friends of the show list</a>, we'll never share it. Brian #1: <a href="https://nedbatchelder.com/blog/202410/github_action_security_zizmor.html?featured_on=pythonbytes">GitHub action security: zizmor</a> <ul> <li>Article: Ned Batchelder </li> <li>zizmor: William Woodruff & others</li> <li>“a new tool to check your GitHub action workflows for security concerns.”</li> <li>Install with cargo or brew, then point it at workflow yml files.</li> <li>It reports security concerns.</li> </ul> Michael #2: <a href="https://github.blog/news-insights/octoverse/octoverse-2024/?featured_on=pythonbytes">Python is now the top language on GitHub</a> <ul> <li>Thanks to Pat Decker for the heads up.</li> <li>A rapidly growing number of developers worldwide <ul> <li>This suggests AI isn’t just helping more people learn to write code or build software faster—it’s also attracting and helping more people become developers. First-time open source contributors continue to show wide-scale interest in AI projects. But we aren’t seeing signs that AI has hurt open source with low-quality contributions.</li> </ul></li> <li>Python is now the most used language on GitHub as global open source activity continues to extend beyond traditional software development. <ul> <li>The rise in Python usage correlates with large communities of people joining the open source community from across the STEM world rather than the traditional community of software developers.</li> </ul></li> <li>There’s a continued increase in first-time contributors to open source projects. 1.4 million new developers globally joined open source with a majority contributing to commercially backed and generative AI projects. <ul> <li>Notably, we did not see a rise in rejected pull requests. This could indicate that quality remains high despite the influx of new contributors.</li> </ul></li> </ul> Brian #3: <a href="https://www.bitecode.dev/p/python-313-what-didnt-make-the-headlines?featured_on=pythonbytes">Python 3.13, what didn't make the headlines</a> <ul> <li>Some pretty cool updates to pdb : the command line Python debugger <ul> <li>multiline editing</li> <li>code completion </li> </ul></li> <li>pathlib has a bunch of performance updates</li> <li>python -m venv adds a .gitignore file that auto ignores the venv.</li> </ul> Michael #4: <a href="https://us.pycon.org/2025/?featured_on=pythonbytes">PyCon US 2025</a> <ul> <li>Site is live with CFP and dates</li> <li><a href="https://us.pycon.org/2025/about/health-safety-guidelines/?featured_on=pythonbytes">Health code</a> is finally reasonable: “Masks are Encouraged but not Required”</li> <li>PyCon US 2025 Dates <ul> <li>Tutorials - May 14-15, 2025</li> <li>Sponsor Presentations - May 15, 2025</li> <li>Opening Reception - May 15, 2025</li> <li>Main Conference and Online - May 16-18, 2025</li> <li>Job Fair - May 18, 2025</li> <li>Sprints - May 19-May 22, 2025</li> </ul></li> </ul> Extras Brian: <ul> <li><a href="https://micro.webology.dev/2024/11/02/please-publish-and.html?featured_on=pythonbytes">Please publish and share more</a> - Jeff Triplett</li> </ul> Michael: <ul> <li><a href="https://github.com/tox-dev/pre-commit-uv?featured_on=pythonbytes">pre-commit-uv</a> <ul> <li>Just spoke with Sefanie Molin <a href="https://talkpython.fm/episodes/show/482/pre-commit-hooks-for-python-devs?featured_on=pythonbytes">about pre-commit hooks on Talk Python</a></li> </ul></li> <li><a href="https://blog.omnivore.app/p/omnivore-is-joining-elevenlabs?featured_on=pythonbytes">Curse you Omnivore</a>!</li> <li>We have moved to <a href="https://www.hetzner.com/cloud/?featured_on=pythonbytes">hetzner</a> </li> <li><a href="https://typora.io?featured_on=pythonbytes">Typora markdown app</a></li> <li>free-threaded Python is now available via <a href="https://docs.astral.sh/uv?featured_on=pythonbytes">uv</a> <pre><code>uv self update uv python install --python-preference only-managed 3.13t </code></pre></li> </ul> Joke: <a href="https://devhumor.com/media/coding-chair-vs-debugging-chair?featured_on=pythonbytes">Debugging char</a>

November 04, 2024 08:00 AM UTC

Planet Python

November 08, 2024

What is this pattern about?

What Does the Code Do?

Key Concepts

1. Function Composition with compose

2. Reading HTML Content

3. Parsing the HTML Table

4. Extracting Rows and Data

5. Converting Rows to DataFrame

6. Assigning Correct Data Types

7. Putting It All Together

Conclusion

November 07, 2024

Grants Workgroup Charter Updates

What’s next?

Something wonderful, bringing more changes

Grants Workgroup Charter update process

November 06, 2024

November 05, 2024

Discussions

Articles & Tutorials

Projects & Code

Events

Changes for the User

Clients

Web Client

Accounting

Commission

Company

Incoterm

Party

Product

Production

Purchase

Quality

Sale

Stock

Changes for the System Administrator

Accounting

Document Incoming

Inbound Email

Web Shop

Changes for the Developer

Server

Accounting

Purchase

Sale

Stock

November 04, 2024

Pass the torch

Give Django a hint

More generic Django

What do you wish for?

Data transformations

Getting started today

Responsible AI

Next Steps

Getting to Know Variables in Python

Creating Variables With Assignments

Setting and Changing a Variable’s Data Type

1. Function Composition with `compose`