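Building a simple HTML compressor in Python

Introduction

Summoning every ounce of strength I have, I rolled up my sleeves to have some fun with Python.
As the old saying goes: however fine the dish, you won't know its flavor unless you taste it; however great the teaching, you won't know its worth unless you study it.
This compressor is pretty crude and nothing to brag about. It is very simple, and the output is only slightly smaller than the input.
That's the downside. The upside is that it rarely breaks anything (the one exception I found is described in the BUG section at the end)!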

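Process

Approach

All it really does is strip line breaks and, where possible, spaces as well. Spaces are only removed in pairs, though; removing single spaces would break tags (for example, <a href="..."> would become <ahref="...">). Because it is so crude, JS and CSS are left uncompressed.
Program starts → walk the files in the directory → compress them one by one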
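To see why spaces are only removed in pairs, here is a small sketch. The sample line is made up for illustration; the two replace chains correspond to "remove every space" and "remove only double spaces".

```python
# Hypothetical sample line, used only for illustration.
line = '    <a href="index.html">Home</a>\n'

# Removing every single space glues the tag name to its attribute.
single = line.replace("\n", "").replace(" ", "")

# Removing only two-space pairs strips the indentation and keeps the tag intact.
double = line.replace("\n", "").replace("  ", "")

print(single)  # <ahref="index.html">Home</a>
print(double)  # <a href="index.html">Home</a>
```

The first result is no longer a valid link, which is why the pairwise rule is the safer choice here.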

Implementation

I'm a bit lazy, so I simply borrowed a function I found online

import os

def getFiles(dir, suffix):  # root directory to search, file suffix to match
    res = []
    for root, directory, files in os.walk(dir):  # => current root, subdirectories under it, files under it
        for filename in files:
            name, suf = os.path.splitext(filename)  # => file name, file suffix
            if suf == suffix:
                res.append(os.path.join(root, filename))  # => join the parts into one path
    return res

for file in getFiles("./", '.py'):  # => find files ending in .py
    print(file)

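Strip the comments, change the file suffix being searched for, use replace to escape the backslashes in the path, and add a check for whether a given path should be processed at all. The modified version is below

Modified function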

import os

def getFiles(dir, suffix):
    res = []
    for root, directory, files in os.walk(dir):
        for filename in files:
            name, suf = os.path.splitext(filename)
            if suf == suffix:
                res.append(os.path.join(root, filename))
    return res

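for file in getFiles(r"path to your HTML folder", '.html'):
    if file == r"D:\blog\public\404.html":
        continue  # skip files that should not be compressed

    # os.path.join already returns a usable path; the backslash doubling
    # is kept from the original and is likely unnecessary.
    path = file.replace("\\","\\\\")
    print(path)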

Define the processing function

def delete(string):
    res = string.replace("\n","").replace("  ","")
    return res
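A quick sketch of what this function does to a few inputs. The sample strings are made up; the function body is the same as above. It shows a few side effects worth knowing about: an odd number of spaces leaves one behind, a double space inside text disappears completely, tabs are left alone, and words that were separated only by a line break end up glued together.

```python
def delete(string):
    # Same logic as above: drop newlines, then drop two-space pairs.
    return string.replace("\n", "").replace("  ", "")

four_spaces = delete("    <p>text</p>\n")    # indentation fully removed
three_spaces = delete("   <p>text</p>\n")    # one space is left over
inner_double = delete("<p>one  two</p>\n")   # the gap between words vanishes
tab_indent = delete("\t<p>text</p>\n")       # tabs are not touched
joined = delete("Hello\n") + delete("world\n")  # line break was the only separator

for value in (four_spaces, three_spaces, inner_double, tab_indent, joined):
    print(repr(value))
```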

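Main flow

# This runs inside the for loop above, once per path.
long = 0   # number of characters before compression
short = 0  # number of characters after compression

text_list = []
with open(path,"r",encoding="UTF-8") as f:

    Not_Change = False  # True while inside a multi-line <script> or <style> block
    for each in f.readlines():
        long += len(each)

        # <script> opened and closed on the same line: keep the line as-is
        if "<script" in each and "</script>" in each:
            text_list.append(each)
            continue

        # <style> opened and closed on the same line: keep the line as-is
        if "<style" in each and "</style>" in each:
            text_list.append(each)
            continue

        # start of a multi-line <script> block: stop compressing
        if "<script" in each:
            Not_Change = True
            text_list.append(each)
            continue

        # end of the <script> block: resume compressing
        if "</script>" in each:
            Not_Change = False
            text_list.append(delete(each))
            continue

        # start of a multi-line <style> block: stop compressing
        if "<style" in each:
            Not_Change = True
            text_list.append(each)
            continue

        # end of the <style> block: resume compressing
        if "</style>" in each:
            Not_Change = False
            text_list.append(delete(each))
            continue

        # any other line: compress it unless we are inside a protected block
        if Not_Change:
            text_list.append(each)
        else:
            text_list.append(delete(each))

# write the result back over the original file
with open(path,"w",encoding="UTF-8") as f:
    for each in text_list:
        short += len(each)
        f.write(each)

print(file + "   compressed! Saved " + str(long-short) + " characters in total!")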
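If you want to try the logic without overwriting any files, the same checks can be wrapped in a function that works on a list of lines. This is only a sketch: compress_lines and the sample input are names I made up for illustration, and the checks are meant to mirror the loop above in the same order.

```python
def delete(string):
    # Same as the processing function above.
    return string.replace("\n", "").replace("  ", "")

def compress_lines(lines):
    """Hypothetical helper: the loop above, applied to a list of lines."""
    result = []
    not_change = False  # True while inside a multi-line <script>/<style> block
    for each in lines:
        if "<script" in each and "</script>" in each:
            result.append(each)
        elif "<style" in each and "</style>" in each:
            result.append(each)
        elif "<script" in each:
            not_change = True
            result.append(each)
        elif "</script>" in each:
            not_change = False
            result.append(delete(each))
        elif "<style" in each:
            not_change = True
            result.append(each)
        elif "</style>" in each:
            not_change = False
            result.append(delete(each))
        elif not_change:
            result.append(each)
        else:
            result.append(delete(each))
    return result

# Made-up sample page, one string per line.
sample = [
    "<html>\n",
    "  <head>\n",
    "    <style>\n",
    "      body { margin: 0; }\n",
    "    </style>\n",
    "  </head>\n",
    "</html>\n",
]
compressed = compress_lines(sample)
saved = sum(len(s) for s in sample) - sum(len(s) for s in compressed)
print("".join(compressed))
print(saved)  # 13
```

Note that the opening <style> line and the CSS inside it keep their indentation and line breaks, while everything outside the block is squeezed onto one line.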

END

That really racked my brain, haha! As compressors go, it's not bad!
Finally, here is the complete code!

import os

total = 0

def getFiles(dir, suffix):
    res = []
    for root, directory, files in os.walk(dir):
        for filename in files:
            name, suf = os.path.splitext(filename)
            if suf == suffix:
                res.append(os.path.join(root, filename))
    return res

def delete(string):
    res = string.replace("\n","").replace("  ","")
    return res

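for file in getFiles(r"D:\blog\public", '.html'):
    if file == r"D:\blog\public\404.html":
        continue  # skip the 404 page

    # os.path.join already returns a usable path; the backslash doubling
    # is kept from the original and is likely unnecessary.
    path = file.replace("\\","\\\\")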

    long = 0
    short = 0
    
    text_list = []
    with open(path,"r",encoding="UTF-8") as f:
        
        Not_Change = False
        for each in f.readlines():
            long += len(each)

            if "<script" in each and "</script>" in each:
                text_list.append(each)
                continue

            if "<style" in each and "</style>" in each:
                text_list.append(each)
                continue
                
            if "<script" in each:
                Not_Change = True
                text_list.append(each)
                continue

            if "</script>" in each:
                Not_Change = False
                text_list.append(delete(each))
                continue

            if "<style" in each:
                Not_Change = True
                text_list.append(each)
                continue

            if "</style>" in each:
                Not_Change = False
                text_list.append(delete(each))
                continue

            
            if Not_Change:
                text_list.append(each)
            else:
                text_list.append(delete(each))

    with open(path,"w",encoding="UTF-8") as f:
        for each in text_list:
            short += len(each)
            f.write(each)

    print(file + "   compressed! Saved " + str(long-short) + " characters in total!")
    total += long-short


print("This run saved %s characters in total" % total)
input("Press Enter to exit!")

BUG

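In testing, this compressor easily strips the indentation inside code blocks along with everything else. And indentation is the soul of Python...
Fix: either add detection for code blocks, or give up on space compression. To give up on space compression, just remove the .replace("  ","") call from the delete function.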
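Here is one possible sketch of the code-block detection idea. It is not the line-by-line approach used in the script above: it reads the whole file as one string and uses a regular expression to find blocks that should be left alone. It assumes code blocks are wrapped in <pre>...</pre>, which may not match your blog's markup; compress_html and PROTECTED are names invented for this example.

```python
import re

def delete(string):
    # Same as the processing function in the script.
    return string.replace("\n", "").replace("  ", "")

# Blocks to leave untouched. Treating <pre> as the code-block marker is an
# assumption; <script> and <style> are protected as in the original script.
PROTECTED = re.compile(
    r"<(pre|script|style)\b.*?</\1\s*>",
    re.DOTALL | re.IGNORECASE,
)

def compress_html(text):
    """Hypothetical helper: compress everything outside protected blocks."""
    out = []
    pos = 0
    for match in PROTECTED.finditer(text):
        out.append(delete(text[pos:match.start()]))  # compress the gap before the block
        out.append(match.group(0))                   # keep the block verbatim
        pos = match.end()
    out.append(delete(text[pos:]))                   # compress the tail
    return "".join(out)

# Made-up sample page containing a Python snippet inside <pre>.
sample = (
    "<div>\n"
    "    <p>intro</p>\n"
    "    <pre>\n"
    "def f():\n"
    "    return 1\n"
    "</pre>\n"
    "</div>\n"
)
result = compress_html(sample)
print(result)
```

The indentation of "return 1" survives because the whole <pre> block is copied through unchanged, while the surrounding markup is still squeezed. Nested or malformed tags are not handled by this sketch.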