Artistic Style Developer Information

Calling Artistic Style from a Python Script

 

Artistic Style can be called from a Python script as a shared library (DLL). The download includes two Python scripts along with the test files. The scripts run on Python 2, Python 3, or Iron Python.

The scripts, ExampleByte.py and ExampleUnicode.py, use Artistic Style as a shared library. The source code to be formatted is read into the program. The shared library must be in the same folder as the Python scripts. The two scripts handle Python 3 Unicode strings in different ways. The following example is ExampleUnicode.py, which uses Unicode strings and a shared library.

Compile Options

To compile Artistic Style as a shared library for use with a Python script, the compile option ASTYLE_LIB must be defined. The library then accepts the source text and options as parameters of a function call instead of from the command line. It is the responsibility of the calling program to read the source files and accept the options from the user via a graphical interface or other method. These are then passed via the function call described below. After a source file is formatted, the text is returned to the calling program, which must then save the file and take any other appropriate action.
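
The division of work described above can be sketched as follows. This is only an illustration, not code from the download: format_file is a hypothetical helper, and its formatter argument is a stand-in for whatever function wraps the call to Artistic Style. It is assumed to take the source and options as byte strings and to return the formatted bytes, or None on failure.

```python
import io
import shutil


def format_file(file_path, formatter, options=b""):
    """Read file_path, pass its bytes to formatter, and save the result.

    formatter is a hypothetical callable taking (text_bytes, options_bytes)
    and returning the formatted bytes, or None if formatting failed.
    Returns True if the file was rewritten, False otherwise.
    """
    # the calling program reads the source file
    with io.open(file_path, "rb") as file_in:
        text_in = file_in.read()

    # the formatter receives the text and options as parameters
    formatted = formatter(text_in, options)
    if formatted is None:
        return False    # leave the original file untouched

    # the calling program saves the result, keeping a backup copy
    shutil.copyfile(file_path, file_path + ".orig")
    with io.open(file_path, "wb") as file_out:
        file_out.write(formatted)
    return True
```

Any callable with the assumed signature can be passed in, which makes the read and save steps easy to try out before the shared library is available.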

Handling Python 3 Unicode

Python 3 and Iron Python use Unicode strings instead of byte strings like Python 2. This can be handled in a couple of ways when processing disk files and sending them to Artistic Style.

One way is to declare the file as binary. This will read the file as a byte string instead of a Unicode string. In this case, you don't need to be concerned with file encoding, and the file can be saved or sent to Artistic Style without decoding. The disadvantage is that the strings are byte strings, not Unicode, and you have to keep track of them. The Python script ExampleByte.py uses this method. The names of byte strings contain the word "byte" so they can be easily identified.
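
A minimal sketch of the binary approach is shown here. The function names are made up for illustration and are not taken from ExampleByte.py. The "byte" naming convention mentioned above is followed.

```python
def read_source_bytes(file_path):
    """Return the file contents as a byte string, without decoding."""
    with open(file_path, "rb") as file_in:
        return file_in.read()


def write_source_bytes(file_path, text_bytes):
    """Write a byte string to the file, without encoding."""
    with open(file_path, "wb") as file_out:
        file_out.write(text_bytes)
```

Because nothing is decoded, a file in any single-byte or UTF-8 encoding passes through unchanged, but string operations on the result must use byte literals such as b"\n".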

Another way is to use the version 3 default and read the file as a Unicode string. The file is read using the default system encoding. If a file uses a different encoding, it can be decoded using the Python "codecs" module (e.g. codecs.open()). Unicode strings will have to be encoded to be sent to Artistic Style or to be saved. Files sent to Artistic Style can always be encoded with UTF-8. The advantage of this method is that you can work with the file in Unicode. The disadvantage is the encoding and decoding that has to be done and the exceptions that should be handled. The file ExampleUnicode.py uses this method. It is the method used in the following example and described in the following discussion.
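
The Unicode approach might look like the sketch below. It uses io.open() with an explicit encoding, which behaves the same way on Python 2 and Python 3, rather than codecs.open(). The function names are hypothetical, and returning None on a decoding error is a choice made for this illustration only.

```python
import io


def read_source_text(file_path, encoding="utf-8"):
    """Read a file as a Unicode string using the given encoding.

    Returns None if the file cannot be decoded with that encoding.
    """
    try:
        with io.open(file_path, "r", encoding=encoding) as file_in:
            return file_in.read()
    except UnicodeDecodeError:
        return None


def to_astyle_bytes(text):
    """Encode a Unicode string to UTF-8 bytes for Artistic Style."""
    return text.encode("utf-8")


def from_astyle_bytes(text_bytes):
    """Decode UTF-8 bytes returned by Artistic Style to Unicode."""
    return text_bytes.decode("utf-8")
```

The file on disk may use any encoding the caller names, while the text exchanged with Artistic Style is always UTF-8.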

AStyleMain Function Call

The function format_source_code is called to format the source code.

Calling the Artistic Style shared library or DLL requires using the Python ctypes foreign function library. It provides C compatible data types and allows calling functions in a shared library or DLL. All of the text sent to Artistic Style must be in byte format, not Unicode. The entire function is shown so you can see the setup for calling Artistic Style.

Syntax

def format_source_code(libc, text_in, options):
    """ Format the text_in by calling the AStyle shared object (DLL).
        The variable text_in is expected to be a byte string.
        The return value is a byte string.
        If an error occurs, the return value is a NoneType object.
    """
    astyle_main = libc.AStyleMain
    astyle_main.restype = c_char_p
    formatted_text = astyle_main(text_in,
                                 options,
                                 ERROR_HANDLER,
                                 MEMORY_ALLOCATION)
    return formatted_text

Parameters

text_in
A byte string containing the source file to be formatted. For Python 3, the Unicode string is encoded to UTF-8, a byte encoding that Artistic Style can process. Since UTF-8 can represent every Unicode character, the encoding should not raise an error.

options
A byte string containing the formatting options. They should be in the same format as in the default options file. The options may be set apart by new-lines, commas, tabs or spaces. The long options do not need the "--" prefix. Comments may be used, but they must be terminated by a new-line "\n" character.

If the file is not a C/C++ file, the file mode option "mode=java" or "mode=cs" must be included. Otherwise, the default mode of C/C++ is used. For Python 3 the options are explicitly declared as byte format with a "b" preceding the string.

ERROR_HANDLER
A pointer to the error handling function. This function is called if there are errors in the text_in or options. This function and the callback declaration are described below.

MEMORY_ALLOCATION
A pointer to the memory allocation function. The memory must be allocated for the output source file. This function will be called when the memory is needed. This function and the callback declaration are described below.
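
The two byte string parameters, text_in and options, could be prepared as sketched below. The helper names are hypothetical. The options "style=allman" and "indent=spaces=4" are used only as sample long options written without the "--" prefix.

```python
def to_bytes(text):
    """Return text as a byte string, encoding Unicode to UTF-8."""
    if isinstance(text, bytes):
        return text
    return text.encode("utf-8")


def build_options(options, mode=None):
    """Join option strings into a single byte string.

    options is a list of long options without the "--" prefix.
    mode is "java" or "cs" for non C/C++ files, otherwise None.
    The options are separated by new-lines.
    """
    parts = list(options)
    if mode is not None:
        parts.append("mode=" + mode)
    return to_bytes("\n".join(parts))
```

For example, build_options(["style=allman", "indent=spaces=4"], mode="java") produces b"style=allman\nindent=spaces=4\nmode=java".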

Return Value

The return type is declared as c_char_p before the function call.

If the function succeeds, the return value is a byte string containing the formatted source code.

If the function fails, the return value is a "NoneType" object. Before the object is returned, an error message will be sent to the error handling function.

This function typically fails for one of the following reasons:

The function will NOT fail for an invalid option in the formatting options. In this case an error message is sent to the error handling function and the formatted source code is returned without using the invalid option.
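
The mapping between the C return value and the Python value can be seen with ctypes alone, without loading the library. In this sketch a null c_char_p stands in for a failed call. The helper result_to_text is hypothetical.

```python
from ctypes import c_char_p


def result_to_text(raw):
    """Convert a byte string result to Unicode text.

    Returns None if raw is None, which is how a NULL pointer
    appears when the return type is declared as c_char_p.
    """
    if raw is None:
        return None
    return raw.decode("utf-8")


# a NULL char pointer is seen by Python as None
null_result = c_char_p().value

# a valid char pointer is seen by Python as a byte string
valid_result = c_char_p(b"int x;").value
```

Testing the result with "is None" before decoding, as the example program does, avoids an AttributeError on a failed call.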

Remarks

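The calling program is responsible for freeing the memory allocated by the MEMORY_ALLOCATION function when it is no longer needed. In the example, the buffer is referenced by the global variable ALLOCATED, so it becomes eligible for release when the MEMORY_ALLOCATION function replaces it on the next call.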

Callback Functions

Two function pointers must be defined for the call to AStyleMain.

Memory Allocation Callback

# AStyle Memory Allocation Callback

# global memory allocation returned to artistic style
# must be global for CPython, not a function attribute
# did not try a class attribute but it may not work for CPython
# IronPython doesn't need global, but it doesn't hurt
ALLOCATED = c_char_p

def memory_allocation(size):
    """ AStyle callback memory allocation.
        The size to allocate is given in bytes.
        The allocated memory MUST BE FREED by the calling function.
    """
    # ctypes are different for CPython and IronPython
    global ALLOCATED
    # ctypes for IronPython do NOT seem to be mutable
    # using ctype variables in IronPython results in a
    # "System.AccessViolationException: Attempted to read or write protected memory"
    # IronPython must use create_string_buffer()
    if __is_iron_python__:
        ALLOCATED = create_string_buffer(size)
        return ALLOCATED
    # ctypes for CPython ARE mutable and can be used for input
    # using create_string_buffer() in CPython results in a
    # "TypeError: string or integer address expected instead of c_char_Array"
    # CPython must use c_char_Array object
    else:
        arr_type = c_char * size    # create a c_char array
        ALLOCATED = arr_type()      # create an array object
        return addressof(ALLOCATED)

# global to create the memory allocation callback function
if os.name == "nt":
    MEMORY_ALLOCATION_CALLBACK = WINFUNCTYPE(c_char_p, c_ulong)
else:
    MEMORY_ALLOCATION_CALLBACK = CFUNCTYPE(c_char_p, c_ulong)
MEMORY_ALLOCATION = MEMORY_ALLOCATION_CALLBACK(memory_allocation)

Remarks

The parameter to this function is the amount of memory that should be allocated.

The memory allocation callback creates a buffer for accepting the formatted file from Artistic Style. The procedures are different for CPython and Iron Python.

Global instructions following the function define the arguments for the callback function used by AStyleMain.

Error Handler Callback

# AStyle Error Handler Callback

def error_handler(num, err):
    """ AStyle callback error handler.
        The return error string (err) is always byte type.
        It is converted to unicode for Python 3.
    """
    print("Error in input {0}".format(num))
    if __is_unicode__:
        err = err.decode()
    error(err)

# global to create the error handler callback function
if os.name == "nt":
    ERROR_HANDLER_CALLBACK = WINFUNCTYPE(None, c_int, c_char_p)
else:
    ERROR_HANDLER_CALLBACK = CFUNCTYPE(None, c_int, c_char_p)
ERROR_HANDLER = ERROR_HANDLER_CALLBACK(error_handler)

Remarks

The error handler callback displays the error message. The first parameter is a number identifying the error. The second parameter is a pointer to a standard error message. For Python 3 and Iron Python, the message is decoded to Unicode. Since the error messages are in ASCII, no decoding exception is handled. After displaying the message, the handler should either abort or continue the program, depending on the error.

Error messages numbered 100-199 are errors that prevent the file from being formatted. A NULL pointer is returned to the calling program. Error messages numbered 200-299 are errors that do NOT prevent the file from being formatted. A valid pointer and a formatted file are returned. This will occur if an invalid option is sent to AStyleMain. The calling program has the option of accepting or rejecting the formatted file.
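
A handler that continues instead of aborting could record the errors and let the calling program decide afterwards, using the number ranges given above. The sketch below is an assumption about how that might be arranged and is not part of the example program. The names ERRORS, record_error, prevents_formatting and has_warnings are hypothetical.

```python
# list of (number, message) tuples recorded by the handler
ERRORS = []


def record_error(num, err):
    """Callback-style handler that records the error and returns."""
    if isinstance(err, bytes):
        # the messages are expected to be ASCII
        err = err.decode("ascii", "replace")
    ERRORS.append((num, err))


def prevents_formatting(num):
    """True for errors 100-199, where no formatted file is returned."""
    return 100 <= num <= 199


def has_warnings(errors):
    """True if any recorded error is in the range 200-299."""
    return any(200 <= num <= 299 for num, _ in errors)
```

After a call returns, the calling program can inspect ERRORS to decide whether to accept or reject a formatted file, then clear the list before the next file.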

Global instructions following the function define the arguments for the callback function used by AStyleMain.

AStyleGetVersion Function Call

This function is called to get the Artistic Style version number.

As with calling the format source function, this also requires using the Python ctypes foreign function library. The entire function is shown so you can see the setup for calling Artistic Style. The actual call to Artistic Style is:
               version = astyle_version()

Syntax

def get_astyle_version(libc):
    """ Get the version number from the AStyle shared object (DLL).
        The AStyle return value is always byte type.
        It is converted to unicode for Python 3.
        Since the version is ascii the decoding will not cause an exception.
    """
    astyle_version = libc.AStyleGetVersion
    astyle_version.restype = c_char_p
    version = astyle_version()
    if __is_unicode__:
        version = version.decode('utf-8')
    return version

Return Value

The return type is declared as c_char_p before the function call.

The return value is the byte string containing the Artistic Style version number.

Remarks

For Python 3 and Iron Python, the version string is decoded to Unicode. Since it contains ASCII characters only, no exceptions should occur.

Example Unicode

The following example will work with Python 2, Python 3, or Iron Python. For Python 3 and Iron Python, it reads the source code disk files as Unicode and decodes and encodes them as needed. The script can be copied and pasted into an editor for execution, or it can be downloaded with test data from the "Developer Information" page. The Artistic Style source code must be compiled as a shared library (DLL) using the option ASTYLE_LIB. The shared library must then be copied to the directory that contains the Python script. The directory of the source code files to be formatted is built as a relative path in the function get_project_directory. This may need to be changed to reflect your directory structure.

 

        
#! /usr/bin/python3

""" ExampleUnicode.py
    This program calls the Artistic Style DLL to format the AStyle source files.
    The Artistic Style DLL must be in the same directory as this script.
    The Artistic Style DLL must have the same bit size (32 or 64) as the Python executable.
    It will work with either Python version 2 or 3 (unicode).
    For Python 3 the files are converted to Unicode and encoded or decoded as needed.
"""

# to disable the print statement and use the print() function (version 3 format)
from __future__ import print_function

import os
import platform
import sys
from ctypes import *

# global variables ------------------------------------------------------------

# will be updated from the platform properties by initialize_platform()
__is_iron_python__ = False
__is_unicode__ = False

# -----------------------------------------------------------------------------

def main():
    """ Main processing function.
    """
    files = ["ASBeautifier.cpp", "ASFormatter.cpp", "astyle.h"]
    options = "-A2tOP"

    # initialization
    print("ExampleUnicode {} {} {}".format(platform.python_implementation(),
                                           platform.python_version(),
                                           platform.architecture()[0]))
    initialize_platform()
    libc = initialize_library()
    version = get_astyle_version(libc)
    print("Artistic Style Version {}".format(version))
    # process the input files
    for file_path in files:
        file_path = get_project_directory(file_path)
        text_in = get_source_code(file_path)
        # unicode must be encoded to utf-8 bytes
        # encoding to utf-8 will not cause an exception
        # IronPython must be explicitly converted to bytes???
        if __is_unicode__ or __is_iron_python__:
            text_in = bytes(text_in.encode('utf-8'))
            options_in = bytes(options.encode('utf-8'))
        else:
            options_in = bytes(options)
        formatted_text = format_source_code(libc, text_in, options_in)
        # if an error occurs, the return is a type(None) object
        if formatted_text is None:
            error("Error in formatting " + file_path)
        # unicode must be decoded from utf-8 bytes
        # decoding from utf-8 will not cause an exception
        if __is_unicode__:
            formatted_text = formatted_text.decode('utf-8')
        save_source_code(formatted_text, file_path)
        # this deletes the Python copy of the formatted text
        # the buffer in ALLOCATED is replaced by the next allocation
        del formatted_text
        print("Formatted {}".format(file_path))

# -----------------------------------------------------------------------------

def error(message):
    """ Error message function for this example.
    """
    print(message)
    print("The program has terminated!")
    os._exit(1)

# -----------------------------------------------------------------------------

def format_source_code(libc, text_in, options):
    """ Format the text_in by calling the AStyle shared object (DLL).
        The variable text_in is expected to be a byte string.
        The return value is a byte string.
        If an error occurs, the return value is a NoneType object.
    """
    astyle_main = libc.AStyleMain
    astyle_main.restype = c_char_p
    formatted_text = astyle_main(text_in,
                                 options,
                                 ERROR_HANDLER,
                                 MEMORY_ALLOCATION)
    return formatted_text

# -----------------------------------------------------------------------------

def get_astyle_version(libc):
    """ Get the version number from the AStyle shared object (DLL).
        The AStyle return value is always byte type.
        It is converted to unicode for Python 3.
        Since the version is ascii the decoding will not cause an exception.
    """
    astyle_version = libc.AStyleGetVersion
    astyle_version.restype = c_char_p
    version = astyle_version()
    if __is_unicode__:
        version = version.decode('utf-8')
    return version

# -----------------------------------------------------------------------------

def get_library_name():
    """ Get an astyle shared library name in the current directory.
        This will get any version of the library in the directory.
        Usually a specific version would be obtained, in which case a constant
        could be used for the library name.
    """
    # "cli" may be an IronPython bug???
    if platform.system() == "Windows" or platform.system() == "cli":
        libext = ".dll"
    elif platform.system() == "Linux":
        libext = ".so"
    elif platform.system() == "Darwin":
        libext = ".dylib"
    else:
        error("Cannot identify platform: " + platform.system())
    # IronPython needs the '.'
    for file_name in os.listdir('.'):
        if (os.path.isfile(file_name)
                and libext in file_name.lower()
                and (file_name.lower().startswith("astyle")
                     or file_name.lower().startswith("libastyle"))):
            return file_name
    error("Cannot find astyle native library in " + os.getcwd() + os.path.sep)

# -----------------------------------------------------------------------------

def get_project_directory(file_name):
    """ Find the directory path and prepend it to the file name.
        The source is expected to be in the "src-p" directory.
        This may need to be changed for your directory structure.
    """
    file_path = sys.path[0]
    end = file_path.find("src-p")
    if end == -1:
        error("Cannot find source directory " + file_path)
    file_path = file_path[0:end]
    file_path = file_path + "test-data" + os.sep + file_name
    return file_path

# -----------------------------------------------------------------------------

def get_source_code(file_path):
    """ Get the source code (unicode in Version 3).
        Opening the file as non-binary will read it as a unicode string.
        An exception is handled in case the file cannot be decoded using
        the system default codec.
        The return value is a unicode string.
    """
    # version 3 will read unicode since the file is not declared as binary
    # could also read the file as binary and use an explicit decode
    try:
        file_in = open(file_path, 'r')
        text_in = file_in.read()
    except IOError as err:
        # "No such file or directory: <file>"
        print(err)
        error("Cannot open " + file_path)
    except UnicodeError as err:
        # "'<codec>' codec can't decode byte 0x81 in position 40813: <message>"
        print(err)
        error("Cannot read " + file_path)
    file_in.close()
    return text_in

# -----------------------------------------------------------------------------

def initialize_library():
    """ Set the file path and load the shared object (DLL).
        Return the handle to the shared object (DLL).
    """
    # change directory to the path where this script is located
    pydir = sys.path[0]
    # remove the file name for Iron Python
    if pydir[-3:] == ".py":
        pydir = os.path.dirname(sys.path[0])
    os.chdir(pydir)
    # return the handle to the shared object
    if os.name == "nt":
        libc = load_windows_dll()
    else:
        libc = load_linux_so()
    return libc

# -----------------------------------------------------------------------------

def initialize_platform():
    """ Check the python_implementation and the python_version.
        Update the global variables __is_iron_python__ and __is_unicode__.
    """
    global __is_iron_python__, __is_unicode__
    if platform.python_implementation() == "CPython":
        # compare as an integer, a string comparison fails for two digits
        if int(platform.python_version_tuple()[0]) >= 3:
            __is_unicode__ = True
    elif platform.python_implementation() == "IronPython":
        __is_iron_python__ = True
        __is_unicode__ = True

# -----------------------------------------------------------------------------

def load_linux_so():
    """ Load the shared object for Linux platforms.
        The shared object must be in the same folder as this python script.
    """
    shared_name = get_library_name()
    shared = os.getcwd() + os.path.sep + shared_name
    try:
        libc = cdll.LoadLibrary(shared)
    except OSError as err:
        # "cannot open shared object file: No such file or directory"
        print(err)
        error("Cannot find " + shared)
    return libc

# -----------------------------------------------------------------------------

def load_windows_dll():
    """ Load the dll for Windows platforms.
        The shared object must be in the same folder as this python script.
        An exception is handled if the dll bits do not match the Python
        executable bits (32 vs 64).
    """
    dll_name = get_library_name()
    dll = os.getcwd() + os.path.sep + dll_name
    if __is_iron_python__:
        try:
            libc = windll.LoadLibrary(dll)
        # exception for IronPython
        except OSError as err:
            print("Cannot load library", dll)
            error("Library is not available or you may be mixing 32 and 64 bit code")
        # exception for IronPython
        # this sometimes occurs with IronPython during debug
        # rerunning will probably fix
        except TypeError as err:
            error("TypeError - rerunning will probably fix")
    else:
        try:
            libc = windll.LoadLibrary(dll)
        # exception for CPython
        except WindowsError as err:
            # print(err)
            if err.winerror == 126:     # "The specified module could not be found"
                error("Cannot load library " + dll)
            elif err.winerror == 193:   # "%1 is not a valid Win32 application"
                print("Cannot load library " + dll)
                error("You may be mixing 32 and 64 bit code")
            else:
                error(err.strerror)
    return libc

# -----------------------------------------------------------------------------

def save_source_code(text_out, file_path):
    """ Save the source code as bytes.
        The variable text_out is Unicode in Python 3.
        The text_out will be encoded to a byte string using the default codec.
        An exception is handled in case the file cannot be encoded.
    """
    # remove old .orig, if any
    backup_path = file_path + ".orig"
    if os.path.isfile(backup_path):
        os.remove(backup_path)
    # rename original to backup
    os.rename(file_path, backup_path)
    # version 3 will encode the file from unicode using the default codec
    # could also use an explicit encode before writing the file
    file_out = open(file_path, 'w')
    try:
        file_out.write(text_out)
    except UnicodeError as err:
        # "'<codec>' codec can't encode characters in position 0-2: <message>"
        print(err)
        error("Cannot write " + file_path)
    file_out.close()

# -----------------------------------------------------------------------------

# // astyle ASTYLE_LIB declarations
# typedef void (STDCALL *fpError)(int, char*);       // pointer to callback error handler
# typedef char* (STDCALL *fpAlloc)(unsigned long);   // pointer to callback memory allocation
# extern "C" EXPORT char* STDCALL AStyleMain(const char*, const char*, fpError, fpAlloc);
# extern "C" EXPORT const char* STDCALL AStyleGetVersion (void);

# -----------------------------------------------------------------------------

# AStyle Error Handler Callback

def error_handler(num, err):
    """ AStyle callback error handler.
        The return error string (err) is always byte type.
        It is converted to unicode for Python 3.
    """
    print("Error in input {}".format(num))
    if __is_unicode__:
        err = err.decode()
    error(err)

# global to create the error handler callback function
if os.name == "nt":
    ERROR_HANDLER_CALLBACK = WINFUNCTYPE(None, c_int, c_char_p)
else:
    ERROR_HANDLER_CALLBACK = CFUNCTYPE(None, c_int, c_char_p)
ERROR_HANDLER = ERROR_HANDLER_CALLBACK(error_handler)

# -----------------------------------------------------------------------------

# AStyle Memory Allocation Callback

# global memory allocation returned to artistic style
# must be global for CPython, not a function attribute
# did not try a class attribute but it may not work for CPython
# IronPython doesn't need global, but it doesn't hurt
ALLOCATED = c_char_p

def memory_allocation(size):
    """ AStyle callback memory allocation.
        The size to allocate is given in bytes.
        The allocated memory MUST BE FREED by the calling function.
    """
    global ALLOCATED
    # ctypes for IronPython do NOT seem to be mutable
    # using ctype variables in IronPython results in a
    # "System.AccessViolationException: Attempted to read or write protected memory"
    # IronPython must use create_string_buffer()
    if __is_iron_python__:
        ALLOCATED = create_string_buffer(size)
        return ALLOCATED
    # ctypes for CPython ARE mutable and can be used for input
    # using create_string_buffer() in CPython results in a
    # "TypeError: string or integer address expected instead of c_char_Array"
    # CPython must use c_char_Array object
    else:
        arr_type = c_char * size    # create a c_char array
        ALLOCATED = arr_type()      # create an array object
        return addressof(ALLOCATED)

# global to create the memory allocation callback function
if os.name == "nt":
    MEMORY_ALLOCATION_CALLBACK = WINFUNCTYPE(c_char_p, c_ulong)
else:
    MEMORY_ALLOCATION_CALLBACK = CFUNCTYPE(c_char_p, c_ulong)
MEMORY_ALLOCATION = MEMORY_ALLOCATION_CALLBACK(memory_allocation)

# -----------------------------------------------------------------------------

# make the module executable
if __name__ == "__main__":
    main()
    os._exit(0)